ABSTRACT


Accurate forecasting of Annual Average Daily Traffic
(AADT) is vital to transportation planning. The design of
roads and analysis of alternative highway projects are
dependent on these forecasts.

This  study  builds  on  previous  efforts  found  in  the 
field  of  rural  traffic  forecasting.  The  study  combines 
careful  statistical  analysis  with  subjective  judgment  to 
develop  models  that  are  reliable  and  easy  to  use.  This 
study  developed  two  different  kinds  of  models  —  aggregate 
and  disaggregate  —  to  forecast  traffic  volumes  at  rural 
locations  in  Indiana's  state  highway  network.  These 
models  are  developed  using  traffic  data  from  continuous 
count  stations  in  rural  locations,  and  data  for  various 
county,  state  and  national  level  demographic  and  economic 
predictor  variables.  Aggregate  models  are  based  on  the 
functional  classification  of  a  highway,  whereas  the 
disaggregate  models  are  location-specific.  These  models 
forecast  future  year  AADT  as  a  function  of  base  year  AADT, 
modified  by  the  various  predictor  variables.  The 
combination   of   aggregate   and   disaggregate   models  will 
provide reliable traffic forecasts. The number of
predictor variables employed in the models was kept to a
minimum. The statistical analysis also found that the
predictor variables are statistically significant; no
other variables will provide significant predictive power
to the models. The models developed in this study provide
higher R² values than those found in the literature, and
more refined statistical techniques reinforce the choice
of variables used in the models. A six-step process to
obtain the future year AADT by employing both aggregate
and disaggregate models is presented to assist in the
models' implementation.


CHAPTER  1 


INTRODUCTION 


1.1 Introduction


Among  the  most  important  factors  in  public  investment 
decisions  is  the  projected  demand  for  an  existing  or 
proposed  facility.  The  pattern  of  traffic  growth  and 
projected  traffic  volumes  have  been  recognized  as  prime 
factors  in  most  analyses  of  highway  projects.  Developing 
future  traffic  estimates  is  not  an  exact  science, 
dependent  as  it  is  on  so  many  hard-to-predict  variables. 
The  traffic  growth  factor  has  a  significant  effect  on 
highway  investment  decisions  pertaining  to  increasing  the 
capacity  of  existing  highways  and  the  construction  of  new 
facilities,  when  limited  funds  are  available.  Traffic 
forecasting procedures must be reasonably easy and
economical to carry out, be sensitive to a wide range of
policy issues and alternatives, and produce information
useful to decision-makers in a form that does not require
extensive training to understand.


Estimates of future traffic could be arrived at by
two very different methods: projections and forecasts.
Projections have been used for years and are based on a
historical record of the desired data item. Trend lines
drawn through prior year data observations are
extrapolated to the target year. In some cases these
extrapolated trends are modified by the analyst based on
his  experience  and  knowledge  of  the  route,  state  or 
region.  Whereas  with  projections  we  are  dealing  only  with 
the  traffic  data,  forecasting  techniques  are  concerned 
with  predicting  the  future  values  of  economic  and  other 
measures  or  indicators  of  person  and  vehicle  travel.  In 
forecasting  techniques,  a  relationship  between  traffic  and 
associated  factor(s)  is  established. 

1.2  Background 

Traffic  data  are  essential  in  nearly  every  step  of 
the planning process. In highway investment (major
maintenance, reconstruction or new construction), a
reliable estimate of future traffic volume is a key
element.


Traffic forecasts can be prepared with a variety of
approaches, depending on whether the forecast refers to an
urban or rural area. In urban areas, forecasts are
generally based on the four-step (trip generation model,
trip distribution model, modal split model and traffic
assignment model) travel-simulation process [21,38]. In
these cases travel on the road network is an output of the
assignment process. Most large metropolitan areas have
developed and implemented a fairly sophisticated set of
computer-based travel simulation models based on the
traditional four-step process. In rural areas, when
assignment-based models do not exist or are not practical
to apply, traffic estimates are generally made by
expanding present traffic into the future based on
projections of population, employment, vehicle
registration, land-use data, or other parameters
[21,32,38].

1.3 Past Research


Traffic  forecasting  in  urban  areas  has  been 
extensively  explored  and  the  forecasting  methodologies, 
mainly  based  on  sophisticated  computer  modeling  programs, 
are  highly  advanced.  On  the  other  hand,  forecasting 
traffic  for  individual  rural  roads,  even  though  widely 
practiced, is still in its early stages. Standardized
methodologies  for  nationwide  use  have  not  been 
established,  and  state  authorities  develop  their  own 
procedures  to  accommodate  their  needs.  One  of  the  reasons 
for  the  development  of  different  procedures  by  different 
state  authorities  might  be  that,  since  the  development  of 
traffic projections is not an exact science, planners base
their methods on different conceptual models and thus use
different  procedures  to  reduce  the  uncertainty  associated 
with  their  projections.  Methods  of  traffic  forecasting 
were  advanced  during  the  mid-sixties  when  statewide 
transportation  studies  were  conducted  by  many  states  to 
fulfill  the  need  for  developing  final  statewide 
transportation  plans.  Traffic  forecasting  was  a  basic 
input  for  these  studies. 

The  various  state  departments  of  highways  developed 
their  own  methods  to  forecast  rural  traffic,  but  very  few 
are  well  documented.  The  following  sections  will  present 
some  of  these  studies  as  they  relate  to  rural  traffic 
forecasting.

1.3.1  Traffic  Growth  Trends  on  Rural  Highways 


In 1956, Morf and Houska [36], in their study of the
Illinois rural highway network, came to the conclusion
that the four factors responsible for traffic growth
patterns were (1) geographic location, (2) type and width
of pavement, (3) proximity to an urban area and (4) type
of service the roadway provides. Growth was assumed to
take the form of an S-shaped curve, as
shown in Figure 1.1, with three stages of development: (1)
increasing growth rate (1st stage), (2) constant growth
rate (2nd stage), and (3) decreasing growth rate (3rd
stage).

Figure 1.1: General Growth Concept


They observed that truck traffic on rural primary
highways was increasing at a faster rate than passenger
car traffic. Their study also indicated that population is
the principal component that affects the trend, followed
by persons per vehicle (or it could be expressed directly
as number of vehicles) and gallons of gasoline or vehicle
miles per vehicle for rural roads of Illinois.


1.3.2  Simplified  Elasticity-Based  Procedure 

In 1982, Neveu [38] developed a set of elasticity-based
models to forecast rural traffic. The models forecasted
future year AADT as a function of base year Annual
Average Daily Traffic (AADT), modified by various
demographic factors. Neveu claimed that the type of service
the roadway provides (interurban, interregional, rural to
urban, urban to rural) is the only factor that had an
appreciable effect on traffic growth rates. Multiple
linear regression was used to identify factors that best
estimated AADT and their respective elasticities. Three
classes of roadway were used, as was done by Morf and
Houska [36]. The background factors examined are
population, number of households, automobile ownership and
employment. These data are collected at town, county and
state level. Neveu eliminated the income variable because
of the difficulty in forecasting future values and found
that the number of households is a better determinant of
travel than population. Each of Neveu's models is
relatively simple, with only one or two independent or
predictor variables. The ultimate result of his study is a set
of nomographs that give quick estimates of the growth
factor, i.e., the elasticity portion of his model.


The data used for Neveu's statistical analyses were
those of the years 1974 to 1978 (a total of only 5 observations
for each station), in an effort to avoid any
complications from the energy crisis of the preceding
years. The background data were collected for each station
according to the town or county in which the station
was located. The roads were classified according to the
type of service they provide: (a) Interstates, (b) Principal
Arterials, and (c) Minor Arterials and Major Collectors.
The R² values of 0.65, 0.77 and 0.20 for road types
(a), (b) and (c), respectively, give an indication of the
explanatory power of the data. For Interstates and Principal
Arterials, the association of AADT with the background
variables is much better than for Minor Arterials and
Major Collectors. For the Minor Arterials and Major Collectors,
the low R² indicates the poor explanatory power
of the variables used. The author identifies two major
problems associated with the model: (i) the difficulty in
obtaining projections of the background variables and
their questionable accuracy at the level they are needed,
and (ii) the difficulty in deciding the applicability of
the model in certain areas (i.e., whether a specific area
is "rural enough" for the model).

Neveu used multiplicative constant elasticity in his
model. While this specification possesses conceptual and
statistical advantages, it does have an inherent weakness
that should be carefully considered [28]. This weakness
results from the constant elasticity structure, which
implies that the effect of the growth in demand on traffic
growth always has been and will remain the same. The constant
elasticity model cannot be used to forecast for more
than a very limited number of years at a time. The result
is that if the model is estimated during a period of high
growth rate, future traffic will be overestimated and vice
versa. Thus, when such models are used, they are recalibrated
as often as practicable in order to ensure that a
correction in the traffic growth rates is made and, therefore,
the margin of error is limited. Models with variable
elasticities are not very common in traffic forecasting.
Such model structures involve more sophisticated and
expensive analysis.
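
A multiplicative constant-elasticity forecast of the kind discussed in this section can be sketched as follows. This is only an illustration: the function names, the single "households" predictor and the elasticity value of 0.8 are assumptions made for the example and are not Neveu's calibrated coefficients.

```python
def growth_factor(base, future, elasticities):
    """Multiplicative constant-elasticity growth factor.

    base, future: dicts of predictor values in the base and future year.
    elasticities: dict of elasticities, one per predictor (hypothetical here).
    """
    factor = 1.0
    for name, elasticity in elasticities.items():
        # Each predictor contributes (future / base) raised to its elasticity.
        factor *= (future[name] / base[name]) ** elasticity
    return factor


def forecast_aadt(base_aadt, base, future, elasticities):
    """Future-year AADT = base-year AADT times the growth factor."""
    return base_aadt * growth_factor(base, future, elasticities)


# Hypothetical example: households grow by 10 percent, assumed elasticity 0.8.
base = {"households": 20000.0}
future = {"households": 22000.0}
elasticities = {"households": 0.8}
aadt_future = forecast_aadt(5000.0, base, future, elasticities)
```

Because the elasticity is held fixed, the same percentage change in a predictor always produces the same percentage change in AADT, which is the structural weakness noted above.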

1.3.3 Trend Analysis-Based Procedure


The Minnesota Department of Transportation (Mn/DOT)
[35] computes a route-specific growth factor from a trend
analysis of the specific route. To determine the current
or base year AADT, 48-hour weekday machine counts are
taken and adjusted using FHWA procedures [18]. After
determining base year AADT, 10-20 years (preceding the
base year) of AADT counts are taken from traffic flow
maps. It has been recognized that the location of the count
stations on the flow map can be different from the previous
year's count stations, primarily due to change of corporate
limits of towns. By linear regression, a line is
fitted to the data and that line is extended to the design
year. The overall growth is then the difference between
design year AADT and base year AADT. Similar graphical
plots of AADT against time for all (or several) major
highway segments are done along the proposed project. If
the growth rates are uniform, a single rate can be applied
to the entire project. If not, the forecaster then must
use judgment in selecting the appropriate rate for each
segment based on his knowledge of the project area.
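
The trend-line step described above (fit a line to historical AADT, extend it to the design year, and take the difference from the base year) can be sketched as below. The counts, years and base year AADT are hypothetical values chosen for illustration, and the helper names are not taken from any Mn/DOT program.

```python
def fit_trend(years, aadt):
    """Ordinary least-squares line: aadt = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(aadt) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, aadt))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def project(intercept, slope, year):
    """Extend the fitted line to the given year."""
    return intercept + slope * year


# Hypothetical historical counts read from traffic flow maps.
years = [1975, 1976, 1977, 1978, 1979, 1980]
aadt = [4000, 4100, 4200, 4300, 4400, 4500]

intercept, slope = fit_trend(years, aadt)
base_year_aadt = 4500.0                       # assumed adjusted base year count
design_year_aadt = project(intercept, slope, 2000)
overall_growth = design_year_aadt - base_year_aadt
```

Repeating the fit for each major segment gives one slope per segment; comparing those slopes is one way to judge whether a single rate can reasonably be applied to the whole project.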

1.3.4  Disaggregate  Analysis  of  Heavy  Commercial  Traffic 


The New Mexico State Highway Department [2] has
designed a procedure for forecasting Heavy Commercial (HC)
traffic and Average Daily Traffic (ADT) on the New Mexico
Interstate system and then calculating the percent HC
traffic. This process, and the computer program developed
from it, is called Trend-line. Trend-line identifies
fourteen distinct heavy commercial truck sectors (geographical)
on the New Mexico Interstate system. Separate
forecasting models were developed for each sector. The
disaggregate analysis (a separate analysis for each sector)
provides a better traffic projection as opposed to
aggregate analysis (all sectors together). Trend-line
analysis includes the national, state and local socio-economic
indicators that affect heavy commercial traffic
on the New Mexico Interstate highways. Eight key demographic
and economic indicators are identified:
1. United States Average Gasoline Cost Per Gallon.

2. United States Disposable Personal Income.

3. New Mexico Population.

4. New Mexico Residential Building Permits, Dollar Value.

5.  United  States  Consumer  Price  Index. 

6.  United  States  Producer  Price  Index. 

7.  New  Mexico  Civilian  Employment. 

8.  New  Mexico  Retail  Trade. 


SAS  (Statistical  Analysis  System)  multivariate 
analysis  —  more  than  one  dependent  variable  in  the 
analysis  —  was  conducted  using  these  indicators  as 
independent  variables.  A  series  of  best  fit  equations  was 
developed,  and  percent  heavy  commercial  of  average  daily 
traffic  was  forecasted  for  a  twenty-year  period. 

The Trend-line sectors showed different percent heavy
commercial traffic for the most recent year and led to
development of separate models for the fourteen separate
sectors. The state frequently uses an assumption to limit
%HC to 30 percent of ADT. A regression equation that
resulted in percent HC over 30 percent of ADT was
defaulted back to the 30 percent level.

In multivariate analysis, HC and ADT were taken as
dependent variables and regressed, using socio-economic
characteristics as independent variables. The socio-economic
variables were identified on the national, state,
county, and local level.

Once  a  potential  indicator  to  estimate  traffic  was 
suggested,  it  was  reviewed  in  several  ways.  First,  it  was 
critiqued  on  the  basis  of  its  theoretical  applicability: 
How  could  the  indicator  be  related  to  HC  or  ADT?  The  list 
of  possible  indicators  was  narrowed  through  this  review. 
The  indicators  were  then  reviewed  in  several  other  ways: 
the  availability  of  accurate  information  and  the  period  of 
data  reporting  and  updates. 


Chi-square analysis demonstrated that ADT on the New
Mexico Interstate was significantly associated with
changes in state population. The standard technique of
population forecasting, Cohort Analysis, was used for
population forecasting and a computer program [7] was
written to interface with the Trend-line HC and ADT
analysis. Cohort Analysis is the process of dividing the
population into age groups; then, each year, each age
group graduates a portion into the next age group, all the
babies born are added into the first age group, the different
age group death rates are applied, and the net in-migration
is added to forecast the next year's population.
This procedure is used because different age groups have
different birth and death rates.
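
One yearly step of the cohort procedure described above can be sketched as follows. The order of operations (deaths applied before graduation), the function name and all rates and counts are assumptions made for illustration; the actual program [7] is not reproduced here.

```python
def project_one_year(pop, graduate_share, death_rate, births, net_migration):
    """Advance an age-grouped population by one year.

    pop, graduate_share, death_rate, net_migration: lists with one entry
    per age group, youngest group first.
    births: number of babies born during the year (added to the first group).
    """
    # Apply the age-specific death rates.
    survivors = [p * (1.0 - d) for p, d in zip(pop, death_rate)]
    # Portion of each group that graduates into the next age group.
    moving = [s * g for s, g in zip(survivors, graduate_share)]
    moving[-1] = 0.0  # the oldest group has no next group to graduate into
    next_pop = []
    for i in range(len(pop)):
        staying = survivors[i] - moving[i]
        arriving = births if i == 0 else moving[i - 1]
        next_pop.append(staying + arriving + net_migration[i])
    return next_pop


# Hypothetical three-group population (young, working age, older).
pop = [1000.0, 2000.0, 1500.0]
graduate_share = [0.2, 0.1, 0.0]
death_rate = [0.001, 0.002, 0.02]
net_migration = [5.0, 10.0, 0.0]
next_pop = project_one_year(pop, graduate_share, death_rate, 60.0, net_migration)
```

Calling the function repeatedly, feeding each result back in as `pop`, gives a forecast for as many years as needed.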

In  the  statistical  analysis,  linear  regressions  were 
conducted  using  Heavy  Commercial  ADT  (HCADT)  and  ADT  as 
dependent  variables.  Six  years  of  historical  data  were 
used.  The  first  models  were  multiple  regression  analyses 
of  HC  and  ADT  by  year.  Then  multivariate  analyses  were 
done with eight independent variables. All regression
analyses were conducted to find the best fit equation.
All equations had an R² value of over 80 percent.

1.4  Scope  of  the  Research 

The  purpose  of  this  research  study  is  to  develop  a 
method  of  establishing  rural  traffic  growth  factors  that 
can  be  used  by  the  Indiana  Department  of  Highways  (IDOH). 
The  research  is  being  carried  out  by  the  Joint  Highway 
Research  Project  (JHRP)  at  Purdue  University  with  the 
sponsorship  of  the  Federal  Highway  Administration  (FHWA) 
and  IDOH. 


The proposed method will be based on the background
input factors for which clear relationships and usable
forecasts exist and will continue to exist. Moreover, the
proposed method must be reliable, well-documented and
flexible. The model to be developed in this study will be
simple to apply. A hand calculator will be adequate for
the application of the model, making the traffic projections
for any year easy to compute.

The  primary  focus  of  this  study  was  the  design  and 
testing  of  a  simple,  fast  method  to  forecast  rural  traffic 
volumes  and  step-by-step  instructions  on  its  use.  This 
report  details  the  development  of  such  a  procedure  in 
order  to  update  the  method  in  future  years.  This  study 
examines  previous  efforts  aimed  at  forecasting  rural 
traffic,  describes  the  chosen  methodology,  and  presents 
the results of the analysis. Finally, some of the limitations
of the procedure are discussed, and some possible
solutions to the limitations are provided.

1.5 Report Organization

This report consists of six chapters and seven appendices.
Chapter 2 discusses the literature review in the
light of forecasting rural traffic and the current procedures
practiced by some state highway departments, as
discussed in Chapter 1.

Chapter  3  addresses  the  problem  of,  and  overall 
methodology  for,  constructing  statistical  models.  Chapter 
4  describes  the  variables  in  the  data  tables  and  their  use 
in regressions.


The analysis of the data gathered in Chapter 4 is
provided in Chapter 5. Statistical reliability tests are
discussed in the preliminary analyses with their results.
Based on the results of preliminary analysis, two types of
models (aggregate and disaggregate) are presented in
Chapter 5 for different categories of highways. Chapter 5
also presents the performance of both types of models,
using the data not included in the model development.
Chapter 6 gives the summary and conclusions of the
research as well as steps for implementation of the models
developed. This chapter also discusses probable problems,
limitations and suggestions to overcome the problems.


The data tables for aggregate analysis developed and
analyzed in Chapters 4 and 5 are presented in Appendix A.
It is believed that this presentation will help in future
modification of the model, if desired. Appendices B and D
present the scatterplots of the dependent variable, Annual
Average Daily Traffic (AADT), against the independent
variables selected for aggregate and disaggregate
analysis. Appendices C and E present the residual plots
of the selected variables in the aggregate and disaggregate
analysis. Appendix F presents four example plots of
simple extrapolation. Appendix G provides the statistical
test to determine the equality of two population means
with an example.


CHAPTER 2


LITERATURE  REVIEW 


2.1 Introduction

This  chapter  presents  a  review  of  the  literature  on 
traffic  forecasting,  with  particular  emphasis  on  rural 
traffic forecasting procedures. Some of the rural traffic
forecasting procedures currently used by certain state
highway departments were discussed in Section 1.3. A
review  of  the  literature  reveals  that  limited  research  has 
been  accomplished  on  the  topic  of  forecasting  traffic 
growth  factors  in  the  context  of  rural  highways.  Some 
ideas  from  this  review  study  have  been  incorporated  in  the 
present  study. 

2.2 Transportation Demand Models


The process of relating the demand for transportation
to the socioeconomic activities that generate it is known
as transportation demand analysis [28]. The results of
this analysis are relationships (often in the form of
models) between measures of activity and measures of
transport demand. Such relationships are often referred
to as transportation demand models. Although demand
analysis is distinct from traffic forecasting, one can use
the results of demand analysis to forecast future traffic
volumes. The demand models provide a major input into the
forecasting process. It should be recognized that there
are limitations of demand models as forecasting tools.
The strength in forecasting is not in the models or
procedures used, but in the methodology applied and in the
logic used to project exogenous factors. The analyst might
well find it reasonable to use models of demand analysis
for short-term forecasting in order to study the impacts
of changes in the demand and supply environments of
transportation. But as the term of forecasting becomes
longer, it is unlikely that the same models will continue
to be of as much relevance.

2.3 Background Factors for Rural Traffic Forecast

2.3.1  How  Background  Factors  Affect  Traffic 


Memmott [32,33] showed the impact of different
traffic growth rates on the estimate of future benefits
from a proposed project, as well as the factors that
affect traffic projection errors. These factors included
the year the projection was made, the percentage of
commercial and industrial land development, and changes in
highway capacity. Memmott also presented a simple model
for projecting future traffic volume that is based on a
multiple regression analysis of historical traffic volume
data and adjustments for capacity changes and land
development.

In  1980,  Hartgen  [20]  introduced  the  concept  of 
adjustment  factors  to  base  line  forecasts  of  traffic  to 
account  for  various  additional  concerns  that  had  not 
previously  been  considered,  or  for  which  the  previous 
assumptions  were  no  longer  valid.  He  recommended  dealing 
with  the  urban  and  rural  contexts  separately.  Among  the 
aspects  considered  were  changes  in  energy  supply  and 
price,  auto  ownership  and  use,  households,  employment  and 
labor  force,  population,  inflation,  ridesharing,  transit, 
and average auto fuel efficiency. He also discussed the
probable range of forecast errors.

2.3.2  Role  of  Background  Factors 


Covault [14] considered the impact of growth trends
in population, motor vehicle registration, motor vehicle
use, and motor fuel consumption on traffic growth. Hartgen
[20] noted that, in nonurban areas where assignment-based
modeling does not exist or may not be appropriate,
estimates are generally made by extending present traffic
volumes into the future by using projections of
population, number of households, cars, employment, county
or town vehicle miles of travel (VMT) or other parameters.
The approach taken by Hartgen [20] is to develop
adjustment factors based on empirical evidence and travel
elasticities that are applied to base line forecasts to
obtain estimates. The factors that will influence travel
are auto efficiency, gasoline price, population, energy
supply cutoffs, inflation, employment, number of
households, urbanization, automobile ownership and use,
etc.

Salovara et al. [44] examined the impacts of
background factors affecting car ownership, to prepare a
forecast of traffic and the number of motor vehicles. The
forecasts were compiled from three scenarios (growth,
adaptation and crisis) based on different international and
national economic situations.


McKay [31], in his work with Cook County and the City
of Chicago, observed a close relationship between
population per square mile and the amount of traffic using
the highways. He found that population decreases rapidly
with the increase in distance from the city of Chicago.
The volume of highway traffic on each route also decreases
rapidly with the increase in distance from the city. The
relation between population and highway traffic indicates
the necessity of considering population trends in the
formulation of a highway improvement program. The
prediction of expected future traffic based on the projection of
the trend of motor vehicle registration is a reasonably
accurate indication of future highway traffic.

Magridge  [29]  used  forecasting  car  ownership  as  a 
technique  to  forecast  traffic.  The  conversion  of  a  car 
ownership  forecast  to  a  traffic  forecast  was  treated  as 
the main problem. He used two important techniques, time
series  analysis  and  cross-section  analysis,  to  forecast 
car  ownership.  The  basic  assumption  in  a  cross-section 
analysis,  as  compared  with  a  time  series  analysis,  is  that 
there  is  a  stable  relationship  between  car  ownership  and 
income.  In  a  subsequent  article,  Magridge  [30]  was  mainly 
concerned  with  car  purchases  and  car  use.  Magridge 
suggests  that  while  the  growth  of  car  ownership  appears 
likely  to  continue,  the  level  of  car  traffic  arising 
therefrom  is  much  more  sensitive  to  policy  on  taxation  and 
service  levels.  The  major  determinants  of  car  ownership 
are  considered  to  be  income  and  car  prices,  but  not  fuel 
prices.

2.4  Time  Series  Forecast  of  Traffic 


Benjamin [6] used time series analysis to forecast
future traffic. Time series analysis uses a logistic
function in which model parameters are estimated by
ordinary least squares. The logistic function cannot
account for sudden shifts in behavior or changes in the
transportation network, but it can provide estimates of
future trends when network changes are small. Time series
analysis uses land use development as the starting point
to formulate the theory of traffic growth. Traffic volume
is treated as a function of time and, as time passes, more
land is developed and traffic increases proportionally.
Land use is initially stable when the land is
agriculturally zoned. As land is developed, traffic
increases until all land in the zone or corridor is
developed. At this point in time, traffic stabilizes.
Traffic volume thereafter remains about the same,
increasing or decreasing by small percentages based on
variations in fuel supply, population density, driving
habits and land use. The greater the land available, the
greater the potential for development. Once most land is
developed, there is little room for further development,
so traffic growth must be slow.


The growth factor in time series analysis will be
inversely proportional to the degree of land developed.
The time series method of traffic forecasting is simpler
and more economical than the other demand forecasting
procedures and is recommended where land use is stable.

2.5 An Application of The Logistic Traffic Growth Model

Taliadoros [46] used a logistic growth model to
estimate parameters to forecast traffic at ten continuous
traffic count stations in Indiana. He adopted the S-shaped
concept of Morf and Houska [36]. Taliadoros
claimed that his procedure is simple, fast and easily
calibrated with updated input data. The model he developed
uses a mathematical procedure to estimate the limiting or
maximum AADT and assumes that the S-curve's inflection
point is a constant proportion of the limiting AADT for
all stations. This study asserted that traffic data alone
can provide reasonable predictions. It did not take into
account any socio-economic variables and thus avoided the
impact of inaccurate projections of these variables. The
study does not predict temporary fluctuations in traffic
growth, but only intends to project the overall growth
pattern at each station.
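
The S-shaped growth pattern can be illustrated with the standard symmetric logistic curve, sketched below. The exact functional form used by Taliadoros is not given in this review, so the formula, the parameter names and the numerical values are assumptions; in this standard form the inflection point falls at one half of the limiting AADT.

```python
import math


def logistic_aadt(year, limit_aadt, rate, inflection_year):
    """Standard logistic curve: AADT approaches limit_aadt as time passes."""
    return limit_aadt / (1.0 + math.exp(-rate * (year - inflection_year)))


# Hypothetical parameters for one count station (not calibrated values).
limit_aadt = 12000.0      # limiting (maximum) AADT
rate = 0.15               # steepness of the curve, per year
inflection_year = 1975    # year in which growth is fastest

# Early, middle and late points on the curve.
for year in (1955, 1975, 1995):
    print(year, round(logistic_aadt(year, limit_aadt, rate, inflection_year)))
```

Before the inflection year the yearly increments grow, near it they are roughly constant, and afterwards they shrink, which corresponds to the three stages of the general growth concept.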

2.6  Statewide  Vehicle  Counting  Program 

Chen [12] proposed an improved method for a statewide
vehicle counting program for Indiana with the help of
statistical theory. The method is applicable to rural and
suburban roads carrying 500 or more vehicles per day.
Ritchie [43] also used a statistical approach for a better
statewide traffic counting program for California. Both
of these studies provide estimates of AADT that are the
basis for computing present year traffic in forecasting
techniques. These estimation procedures are based on the
FHWA Guide for Traffic Volume Counting Manual [18]. The
data from the automatic traffic records are used to
develop AADT values and monthly adjustment factors for the
continuous count stations.

Drusch [16] proposed a traffic counting program to
estimate AADT, which is similar to the Chen [12] study.
Both of them used the FHWA method of grouping stations to
convert coverage counts to AADT. Traffic counts taken over
24 or 48 consecutive hours between mid-Monday and
mid-Friday are known as coverage counts.
Coverage counts are defined as single observations that,
through the application of factors, can be expanded to the
AADT. ITE Committee 6-1 [27] looked at estimating AADT on
low volume roads (less than 2000 vpd). The basis of the
Chen study is Petroff and Blensly's work [40] on improving
traffic count procedures by application of statistical
methods. Petroff [41] earlier had developed some criteria
for scheduling mechanical traffic counts, which were used
later for other studies.


The expansion factors (adjustment factors) for
adjusting coverage counts to AADT estimates are group mean
values of monthly adjustment factors. The procedures for
estimating AADT volumes used by the Indiana State Highway
Department, based on the FHWA "Guide for Traffic Volume
Counting Manual" [18], are:

1. A monthly adjustment factor is computed for each
continuous  count  station  for  each  month.  It  is  the 
ratio  of  the  AADT  to  the  monthly  average  weekday 
traffic.  The  monthly  average  weekday  traffic  is 
computed  from  all  the  weekdays  except  Fridays  in  a 
month  for  the  continuous  count  station. 

2.  The  24-hour  averages  of  the  48-hour  coverage  counts 
are  calculated.  The  48-hour  coverage  counts  are 
taken  on  weekdays,  usually  between  noon  Monday  and 
noon  Friday. 

3.  All  the  continuous  count  stations  are  grouped  as  per 
the  "Guide"  without  considering  functional  grouping. 
The grouping steps are outlined below:

a.  Using  the  data  for  the  previous  year  arrange  the 
monthly  adjustment  factors  for  each  month  in 
ascending  order. 


b.  For  each  month  determine  a  set  of  stations  such 
that  the  difference  between  the  smallest  and  the 
largest  monthly  factor  does  not  exceed  the  range 
of  0.20  in  the  values  of  the  factors.  For  each 
month  determine  from  several  possible  sets  that 
set  having  the  largest  number  of  stations.  Such 
a set will probably not be the same for each of
the twelve months. That is, groupings tend to
vary from month to month.

c. From the twelve previously determined sets,
select one set that contains those stations
common  to  all  the  twelve  sets.  In  addition  to 
these  stations,  a  few  additional  stations  are 
assigned  to  the  set,  though  they  have  factors 
that  are  outside  of  the  0.20  range  in  some 
months.  Investigations  have  shown  that  special 
conditions  can  cause  an  abnormal  change  in 
traffic  volumes  for  a  month  or  two  and  study  of 
the  data  for  previous  years  indicated  that  these 
added  stations  had  factors  that  would  have 
placed  them  within  the  set  determined  from 
current  data.  A  set  of  stations  determined  by 
such  a  procedure  is  called  a  group. 


d.  Steps  b  and  c  are  repeated,  considering  those 
stations  that  have  not  been  included  in  the 
first  group,  and  a  second  group  is  selected. 
Steps  b  and  c  are  repeated  a  number  of  times, 
until  only  those  stations  with  extreme  monthly 
adjustment  factor  values  remain  ungrouped. 
These stations are placed in a group entitled
"Special Stations".

4. For each group, compute the average of the monthly
adjustment factors for each month to arrive at the
group mean monthly adjustment factor. Some stations
in a group, however, are not included in the
computation of the mean factor for a particular
month. That is, those stations having a factor
outside the 0.20 range of the group for that month
are not included.

5. The group mean of the monthly adjustment factors for
each month is used as an adjustment factor that is
applied to 24-hour averages of 48-hour counts on
weekdays.

6.  The  average  counts,  if  outdated  because  of  the  5-year 
cycle  used  in  obtaining  coverage  counts,  are  updated 
to  the  current  year  by  a  traffic  growth  factor 
determined  from  the  ATR  group  to  which  the  coverage 
counts  have  been  assigned. 

7.  The  updated  coverage  counts  are  multiplied  by  the 
same  year  mean  monthly  adjustment  factor  of  the  group 
to  which  the  coverage  counts  are  assigned  to  obtain 
an  estimated  AADT  for  the  roadway  section  where  the 
coverage  count  was  taken. 
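
The arithmetic of steps 1, 2, 3b and 4 through 7 can be sketched as follows. This is an illustrative outline only, not the Highway Department's program: the function names, the station labels, and all numbers are hypothetical; the 48-hour coverage count is assumed to be supplied as a two-day total; and the exclusion of out-of-range stations described in step 4 is omitted for brevity.

```python
from statistics import mean


def monthly_adjustment_factor(aadt, monthly_avg_weekday_traffic):
    """Step 1: ratio of AADT to the monthly average weekday traffic."""
    return aadt / monthly_avg_weekday_traffic


def largest_set_within_range(factors, max_range=0.20):
    """Step 3b: largest set of stations whose factors span <= max_range.

    `factors` maps a station label to its adjustment factor for one month.
    """
    ordered = sorted(factors.items(), key=lambda item: item[1])
    best, start = [], 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it fits within the range
        # (the small tolerance guards against floating-point round-off).
        while ordered[end][1] - ordered[start][1] > max_range + 1e-9:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return [station for station, _ in best]


def group_mean_factor(factors, group):
    """Steps 4-5: mean monthly factor over the stations in a group."""
    return mean(factors[station] for station in group)


def estimate_aadt(count_48h_total, group_factor, growth_factor=1.0):
    """Steps 2, 6 and 7: expand a 48-hour coverage count to an AADT."""
    average_24h = count_48h_total / 2.0      # step 2: 24-hour average
    updated = average_24h * growth_factor    # step 6: 1.0 means no update
    return updated * group_factor            # step 7: apply group factor


if __name__ == "__main__":
    # Hypothetical factors for one month at five continuous count stations.
    july = {"A": 0.95, "B": 1.00, "C": 1.05, "D": 1.10, "E": 1.40}
    group = largest_set_within_range(july)
    factor = group_mean_factor(july, group)
    print(group, round(factor, 3), round(estimate_aadt(8000, factor)))
```

In this example station E falls outside the 0.20 range and would be left for a later group or for the "Special Stations" group.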

2.7  Comments  on  Forecasting  Techniques 


Armstrong [4,5], in his studies of forecasting,
concluded that sophisticated extrapolation techniques have
had a negligible payoff for accuracy in forecasting. More
sophisticated methods are generally more difficult to
understand, and they cost more to develop, maintain, and
implement. On the benefit side, more sophisticated
methods may be expected to produce more accurate forecasts
and to provide a better assessment of uncertainty.
However, highly complex models may in fact reduce
accuracy. While the complex models may provide better
fits to historical data, this superiority does not
necessarily translate into better forecasts. The danger
is especially serious when limited historical data are
available. He recommended simple methods and the
combination of forecast techniques. The combinations may
produce significant improvements in forecast reliability.
The question of how many forecasts to combine is, of
course, a cost/benefit issue. The weighting of the different
forecasting methods may raise another problem. Armstrong
suggested starting with the least expensive method(s)
and/or the most understandable method(s), and then
investing in successively more expensive methods. He
suggested using methods that are as different as
possible and simply weighting each forecast equally. He
proposed that complexities should be avoided unless
absolutely necessary. Accordingly, simple methods, which are
easily understood, have been used to develop traffic growth
factor models in this study.
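
The equal-weight combination suggested by Armstrong amounts to a simple average of the individual forecasts. A minimal sketch, in which the function name `combine_forecasts` and the forecast values are hypothetical:

```python
def combine_forecasts(forecasts):
    """Equal-weight average of forecasts produced by different methods."""
    if not forecasts:
        raise ValueError("at least one forecast is required")
    return sum(forecasts) / len(forecasts)


if __name__ == "__main__":
    # Hypothetical AADT forecasts from three different methods.
    print(combine_forecasts([9000.0, 10000.0, 11000.0]))
```

Equal weights avoid the separate problem of estimating a weight for each method, at the cost of treating a weak method the same as a strong one.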

2.8 Definition of Functional Classification of Highways

The definitions of the functional classifications of
rural highways [1,34] used in this study are presented
below:

1. Rural Interstate: Fully controlled access facilities
that are part of the interstate system. The major
purpose of these highways is to provide access to and
between urban areas.

2. Rural Principal Arterial: A network of routes with
the following service characteristics:

(a) Corridor movement with trip length and density
suitable for substantial statewide or interstate
travel.

(b)  Movements  between  all,  or  virtually  all,  urban 
areas  with  population  over  50,000  and  a  large 
majority  of  those  with  population  over  25,000. 

Thus,  highways  having  high  traffic  volumes,  serving 
the  longest  urban  trips  (one  end  in  an  urban  area), 
and  providing  access  to  major  activity  centers  fall 
in  this  category.  In  this  class,  service  to  abutting 
land  is  subordinate  to  the  movement  of  traffic. 


3. Rural Minor Arterial: Highways connecting with the
principal arterial system and local system fall in
this category. More emphasis is placed on land access
and providing service to trips of moderate length.
The  rural  minor  arterial  road  system,  in  conjunction 
with  the  rural  principal  arterial  system,  forms  a 
network  with  the  following  service  characteristics: 

(a)  Linkage  of  cities,  large  towns,  and  other  traffic 
generators  that  are  capable  of  attracting  travel  over 
long  distances. 

(b)  Integrated  interstate  and  intercounty  service. 

(c)  Internal  spacing  consistent  with  population 
density,  so  that  all  developed  areas  of  the  state  are 
within  reasonable  distances  of  arterial  highways. 

(d)  Corridor  movements  consistent  with  items  (a) 
through  (c),  with  trip  length  and  travel  densities 
greater  than  those  predominantly  served  by  rural 
collector  or  local  systems. 


Minor arterials are designed to provide for
relatively  high  travel  speeds  and  minimum 
interference  to  through  movement. 

4.  Rural  Collector:  Roads  penetrating  neighborhoods, 
collecting  traffic  from  local  streets,  and  channeling 
it  to  the  arterial  system.  The  collector  system 
primarily  provides  land  access.  This  type  of  road 
primarily serves intracounty travel, and travel
distances are shorter than on arterial routes.

5. Rural Local Road: Roads providing direct access to
abutting  land.  Through  traffic  usually  does  not  use 
this  type  of  road.  These  local  roads  serve  travel 
over  relatively  short  distances. 

2.9   Chapter  Summary 


The  objective  of  this  chapter  is  to  provide  a  brief 
review  of  the  literature  pertaining  to  rural  traffic 
forecasting.  Definitions  of  the  functional 
classifications  of  rural  highways  have  been  provided  to 
aid  in  classifying  a  highway  for  which  a  traffic  growth 
factor  is  desired.  Procedures  to  estimate  AADT  from 
short-term traffic counts have been introduced. The
commonly cited background factors for rural traffic
forecasting have been identified in this chapter. Some of
those factors, for which data are available, are used in
this research.

CHAPTER 3


PROBLEM  STATEMENT  AND  METHODOLOGY 


The  forecasting  of  traffic  on  rural  highways  has  not 
been  a  major  focus  of  transportation  research.  Most  of  the 
critical  issues  of  this  area  have  already  been  mentioned 
in  Section  1.3  and  in  Chapter  2  (Literature  Review).  In 
this  research,  an  effort  is  made  to  develop  models  to 
predict  future  traffic  on  rural  highways  in  Indiana. 


The  current  practice  at  the  Indiana  Department  of 
Highways  (IDOH)  to  forecast  future  traffic  on  state 
highways  is  based  on  a  pair  of  20-year  growth  factors  for 
each  of  Indiana's  92  counties.  One  growth  factor  in  a 
county  applies  to  its  rural  highways,  the  other  to  its 
urban sections. Because the current set of
traffic growth factors is outdated, overly simplistic,
and lacking the documentation necessary to update it,
the proposed method is intended to provide a means of
predicting future traffic volumes that is reliable,
well-documented, flexible, and based on input factors for which
clear relationships and usable forecasts will continue to exist.

A clear distinction should be made about the nature
of traffic forecasting methodologies. They are divided
into two separate groups: (1) those that address the
forecasting problem as a network analysis, based on the
traditional four-step process that requires enormous
amounts of data and sophisticated computer resources,
while not guaranteeing forecasts that are appreciably
superior to less detailed methods, and (2) the simple,
easy-to-use forecasts on a road-to-road basis that fulfill
the particular needs of the local highway departments.

The proposed method seeks a suitable "middle ground":
a method that provides a reliable forecast with modest
data and computational requirements. The models developed
should  be  relatively  simple  to  use  and  could  be  updated 
without  difficulty.  This  study  will  meet  the  continuous 
needs  of  IDOH  for  a  reliable  method  of  estimating  future 
traffic  on  individual  routes  as  an  aid  to  the  planning 
process  and  in  implementing  the  Highway  Improvement 
Program. 

The study by Morf and Houska [36] leads to the
conclusion that the characteristic "type of service" has a
remarkable effect on traffic growth rates. Highways with
the greatest percentage of interurban or interregional
service generally had the largest increases in travel.
Roads that serve largely urban-to-rural or rural-to-urban
travel had the smallest increases. The results of the Morf
and Houska study suggest different traffic forecasting
models for different functional classes of highway. The
functional classes of highway are interstates
(representing interurban and interregional service),
principal arterials (representing rural-to-urban service),
and minor arterials and major collectors (representing
rural-to-rural service). Using functional class as the
determinant, the four road types are rural interstate,
rural principal arterial, rural minor arterial, and rural
major collector. Statistical analyses in Chapter 5
suggest a different model for each of the four highway
categories and/or a separate model for each station, as
opposed to one common model for all highway categories.


A variety of forecasting models were examined. The
simplest one treated AADT (Annual Average Daily Traffic) as
directly proportional to the background factors of Table
3.1, such as population or number of households. Table
3.1 presents a summary of background factors used in
developing the models. State level data were used only in
the case of interstate and principal arterial highways.
However, it was felt that the explanatory power of such a
model would be too low to provide reasonably accurate
forecasts. Such a procedure also carries with it a problem
inherent in all regression models: the problem of
forecasting outside the range of predictor variables over
which the model was calibrated.


Table 3.1

Background Factors

Level          Factors

A. County      1. Population
               2. Households
               3. Vehicle Registrations
               4. Employment
               5. Income

B. State       1. Population
               2. Households
               3. Vehicle Registrations
               4. Employment
               5. Income

C. National    1. Gasoline Price
               2. Consumer Price Index
               3. Gross National Product
               4. Income

D. Other       Year

An  elasticity-based  model  [38]  was  finally  selected 
and  used  to  relate  future  year  AADT  to  present  year  AADT 
by  means  of  a  number  of  background  factors.  The  general 
form  of  the  model  is  as  follows: 


$$\mathrm{AADT}_f = \mathrm{AADT}_p \left[ 1.0 + \sum_{j=1}^{n} e_j \, \frac{x_{j,f} - x_{j,p}}{x_{j,p}} \right] \qquad (3.1)$$

or, upon rearrangement,

$$\frac{\mathrm{AADT}_f - \mathrm{AADT}_p}{\mathrm{AADT}_p} = \sum_{j=1}^{n} e_j \, \frac{x_{j,f} - x_{j,p}}{x_{j,p}} \qquad (3.2)$$

where,

$\mathrm{AADT}_f$ = AADT in future year,
$\mathrm{AADT}_p$ = AADT in present year,
$x_{j,f}$ = value of variable $x_j$ in the future year,
$x_{j,p}$ = value of variable $x_j$ in the present year,
$e_j$ = elasticity of AADT with respect to $x_j$,
$n$ = number of associated variables.
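
Equations 3.1 and 3.2 can be sketched in a few lines of Python. The function names, the variable names, the elasticities and the predictor values below are hypothetical and serve only to show the arithmetic; they are not calibrated results.

```python
def growth_factor(elasticities, present, future):
    """Right-hand side of equation 3.2: sum of e_j * (x_jf - x_jp) / x_jp."""
    return sum(
        e_j * (future[name] - present[name]) / present[name]
        for name, e_j in elasticities.items()
    )


def forecast_aadt(aadt_present, elasticities, present, future):
    """Equation 3.1: future-year AADT pivoted on present-year AADT."""
    return aadt_present * (1.0 + growth_factor(elasticities, present, future))


if __name__ == "__main__":
    # Hypothetical elasticities and predictor values, for illustration only.
    elasticities = {"households": 0.5, "gasoline_price": -0.2}
    present = {"households": 10000.0, "gasoline_price": 50.0}
    future = {"households": 11000.0, "gasoline_price": 55.0}
    print(forecast_aadt(5000.0, elasticities, present, future))
```

With these assumed inputs, a 10 percent rise in both predictors gives a growth factor of 0.5(0.10) - 0.2(0.10) = 0.03, so a present-year AADT of 5,000 grows to about 5,150.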


The elasticity-based model was selected for several
reasons. The most important reason was that it was
believed that the range of volumes over which the model
would be applied would be much greater than that used in
developing the model, making a simple linear regression
model that relates AADT to the background factors directly
inappropriate. Second, the use of present year AADT to
estimate future year AADT (as a sort of pivot point) would
reduce the problem of nonresident travel. Also, the
elasticity portion of the model calculates a growth factor
directly (see the right-hand side of equation 3.2).

The AADT values were obtained from the Highway
Department's continuous count program. Only those stations
classified as rural in nature were selected for use in the
study. This yielded a total of 23 stations throughout the
state for the four categories of highways. Those stations
are shown in Figure 3.1. Based on the county in which the
automatic traffic count station is located, the various
background factors (see Table 3.1) were collected.

[Figure 3.1: Rural Continuous Count Stations, State of Indiana. Map legend: Interstate, Principal Arterial, Minor Arterial, Major Collector.]

The elasticities and the appropriate background
factors are derived from a linear equation that relates
AADT to a variety of the factors in Table 3.1. It can be
shown mathematically that, given an equation of the form:

$$Y_i = a_0 + \sum_{j=1}^{n} a_j X_{ij} \qquad (3.3)$$

where,

$Y_i$ = value of dependent variable at ith observation; $i = 1, \ldots, m$,
$X_{ij}$ = value of jth independent variable at ith observation; $j = 1, \ldots, n$,
$a_0$ = constant term,
$a_j$ = regression coefficient for jth independent variable,
$m$ = number of observations,
$n$ = number of independent variables.

Elasticity measures can be estimated by:

$$e_j = a_j \, \frac{\bar{X}_j}{\bar{Y}} \qquad (3.4)$$

where,

$e_j$ = elasticity of AADT with respect to independent variable $x_j$,
$\bar{X}_j$ = overall mean of the jth independent variable,
$\bar{Y}$ = overall mean value of dependent variable,
$a_j$ as defined below equation 3.3.
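
Equation 3.4 can be illustrated with a short Python sketch. It assumes that the regression coefficients $a_j$ of equation 3.3 have already been estimated by some least-squares routine; the function name `elasticities_at_means`, the variable name, and the data are hypothetical.

```python
from statistics import mean


def elasticities_at_means(coefficients, x_columns, y_values):
    """Equation 3.4: e_j = a_j * mean(X_j) / mean(Y)."""
    y_bar = mean(y_values)
    return {
        name: a_j * mean(x_columns[name]) / y_bar
        for name, a_j in coefficients.items()
    }


if __name__ == "__main__":
    # Hypothetical data that satisfy Y = 1000 + 2 * X exactly.
    x_columns = {"vehicle_registrations": [100.0, 200.0, 300.0]}
    y_values = [1200.0, 1400.0, 1600.0]
    coefficients = {"vehicle_registrations": 2.0}  # a_j, assumed already fitted
    print(elasticities_at_means(coefficients, x_columns, y_values))
```

Because the elasticity is evaluated at the overall means, it is a single summary value per variable rather than a quantity that changes from observation to observation.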


Thus, using multiple linear regression, the
background factors that best estimate AADT and their
respective elasticities can be derived. The data for
estimation of the background factors and elasticities came
from a variety of sources. Details regarding the data are
presented in the next chapter.

CHAPTER 4


DATA  COLLECTION  AND  DATA  TABLES 


4.1  Introduction 


In  this  chapter,  a  number  of  variables  that  have  been 
identified  in  Table  3.1  will  be  discussed  along  with  the 
traffic  data  for  which  forecasts  are  desired.  The  data 
tables  for  different  highway  categories,  identified  in 
earlier  chapters,  will  also  be  discussed.  These  data 
tables  are  the  input  medium  for  statistical  analysis. 
Some  of  the  earlier  attempts,  which  were  dropped  later  on 
due  to  some  difficulties,  are  described  briefly  in  this 
chapter.  The  main  objective  of  this  chapter  is  to 
describe  the  variables  and  the  evolution  of  the  data 
tables  used  in  the  analysis.  The  sources  of  the  data  and 
their  conversion,  where  needed,  are  discussed  in  detail. 
These  data  tables  could  be  modified  when  new  count 
stations  and/or  new  census  reports  become  available,  in 
order  to  calibrate  and  modify  the  developed  models  to 
predict future traffic.

The variables examined by regression analysis are
shown in Table 4.1.

4.2 Description of Variables

Y, Annual Average Daily Traffic (AADT)

AADT is the average 24-hour traffic volume for a
given year, for both directions of travel, unless
otherwise specified. This is the only response variable
that needs to be predicted for future years. The State of
Indiana  has  altogether  23  rural  continuous  traffic  count 
stations  (identified  in  Figure  3.1  of  Chapter  3)  to 
measure  AADT  on  different  functional  classes  of  highway 
[23].  For  this  study,  each  station  has  been  assigned  to 
one  of  the  four  categories  of  highway  identified  in 
Chapter  2.  The  resulting  classification  is  shown  in  Table 
4.2. 


In the early stages of this study, data tables were
based on traffic data from the 1950's, 1960's and 1970's.
In those cases, the data for every fifth year were taken.
The aim in these early stages was to use only basic
(easily acquired) census data as the independent
variables. However, the literature [5,15,37,38] suggests
the use of only recent data to develop model(s) for
forecasting.


Table 4.1

Variables for Regression Analysis

Symbol   Description of the Variable                         Type of Variable

Y        Annual Average Daily Traffic (AADT)                 -
X1       County Vehicle Registrations                        Demographic
X2       US Gasoline Price in cents per gallon, 1972 $       Economic
X3       Year                                                -
X4       County Population                                   Demographic
X5       County Households                                   Demographic
X6       County Employment                                   Economic
X7       State Vehicle Registrations                         Demographic
X8       State Population                                    Demographic
X9       State Households                                    Demographic
X10      State Employment                                    Economic
X11      Consumer Price Index (CPI) - US                     Economic
X12      Gross National Product (GNP),                       Economic
         in billions of 1972 dollars
X13      Per Capita Disposable Personal                      Economic
         Income (national), in 1972 $

Table 4.2

IDOH (*) Rural Continuous Stations: Location and Highway Category

Highway Category             Count Station   Highway Name     Location (County)

1. Rural Interstate          172A            I-65             Jackson
                             3070A           I-70             Hancock
                             S-S?4A          I-74             Montgomery

2. Rural Principal Arterial  66A             US 50            Dearborn
                             134A            US 30            Allen
                             173A            US 41            Knox
                             254B            US 31            Marshall

3. Rural Minor Arterial      254             SR 9             Noble
                             279A            US 6             Elkhart
                             301A            US 421           Ripley
                             313A            SR 67            Morgan
                             319A            SR 56            Dubois
                             424             US 52            Tippecanoe
                             100X            US 41            Lake
                             256A            US 421           Pulaski
                             262A            US 24            White
                             47A             SR 1             Randolph

4. Rural Major Collector     7047A           CR 68 (900N)     Rush
                             30063A          CR 63 (E-00E)    Hancock
                             54352A          CR 352 (400W)    Montgomery
                             53A             US 40            Hancock
                             200X            US 31            Bartholomew
                             5420A           US 136           Montgomery

(*) IDOH: Indiana Department of Highways

An attempt was made to increase the number of cases or
observations for each category of highway. The
use of annual data, as opposed to every fifth year data,
helped to increase the number of cases. To expand the
number of observations, the following procedure was used.
Traffic flow maps [25] were closely examined with the help
of the 1985 functional classification system map of
Indiana. Several problems resulted from the use of these
traffic flow maps. First, the counts indicated on
different traffic flow maps were for highway segments
whose end points would vary with each edition of the map.
Second, the traffic estimates on the traffic flow map are
dependent on some adjustment factors derived from traffic
counts at continuous count stations. They are not pure
volumes taken under constant conditions but are themselves
estimates. Finally, the traffic data on the flow maps
need interpolation to determine the traffic for years
other than those in which the flow map data were
assembled. Consequently, the idea of using traffic data
from the traffic flow maps was dropped in favor of
interpolated values. At that point, the development of a
prototype model took precedence over precise values for
each year at continuous count stations. The traffic data
(AADT) [23] used in this study were taken from the 23
rural continuous count stations for the years 1970 to 1984.


AADT data from 1970 to 1982 were used to develop the
data tables, providing as many as thirteen AADT
observations per count station. The traffic data for 1983
and 1984 were not used in developing the model(s), but
were kept aside to test the model(s) to be developed.
Column 1 of the data tables in Appendix A contains AADT
(Y).


X1, County Vehicle Registrations, and
X7, State Vehicle Registrations

The total number of vehicle registrations in the
county where a count station is located (X1) and that for
the whole state of Indiana (X7) are published each year by
the Indiana Bureau of Motor Vehicles [22]. These data are
reliable in the sense that they are not estimates, but are
counts made at motor vehicle registration offices
throughout the state. These variables are proposed to
explain AADT (Y) on the assumption that AADT in a
particular year at a given place is closely related to the
number of vehicles registered then and there. The
prediction of expected future traffic based on the
projection of the trend of motor vehicle registrations is
a reasonably accurate indication of future highway
traffic. The value of variable X1 for each year, 1970 to
1982, was used in the data tables for all categories of
highways. The variable X7 was used only for interstates
and principal arterials, because it was believed that
state level data influence those highways that run across
the state. Column 3 of the data tables in Appendix A
represents X1, and column 9 of Tables A1 and A2 in
Appendix A represents X7 for the years 1970 to 1982.


X2, US Gasoline Price in cents per gallon, 1972 dollars

The variable X2 was used for all categories of
highway, on the assumption that the price of gasoline at
the state and county level parallels the national level
retail price. For use in the data tables, the prices (see
Table 4.3) were converted to 1972 dollars by applying the
consumer price index (CPI) for transportation [8,17] in
equation 4.1.

$$P_{1972} = P_{19XX} \times \frac{\mathrm{CPI}_{1972}}{\mathrm{CPI}_{19XX}} \qquad (4.1)$$

where,

$P_{1972}$ = US retail motor gasoline price in cents per gallon, in 1972 dollars, for the year 19XX,
$P_{19XX}$ = US retail motor gasoline price in cents per gallon, in current dollars, for the year 19XX,
$\mathrm{CPI}_{1972}$ = Consumer Price Index for transportation for the year 1972 ($\mathrm{CPI}_{1972} = 100$),
$\mathrm{CPI}_{19XX}$ = Consumer Price Index for transportation for the year 19XX.

[Table 4.3: US Gasoline Prices [43]. The tabulated values are illegible in the source scan.]

The converted US gasoline prices in 1972 dollars (X2) are
shown under column 4 in the data tables of Appendix A.
This variable is adopted on the assumption that AADT is
inversely proportional to the price of gasoline.
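
The conversion in equation 4.1 is a single ratio. A minimal sketch, in which the function name `to_1972_cents` and the price and index values are hypothetical rather than figures from Table 4.3:

```python
def to_1972_cents(price_current_cents, cpi_year, cpi_1972=100.0):
    """Equation 4.1: deflate a current-dollar price to 1972 dollars.

    price_current_cents -- price in cents per gallon, current dollars
    cpi_year            -- transportation CPI for the year of the price
    cpi_1972            -- transportation CPI for 1972 (taken as 100)
    """
    return price_current_cents * cpi_1972 / cpi_year


if __name__ == "__main__":
    # Hypothetical inputs: 120 cents per gallon in a year whose CPI is 200.
    print(to_1972_cents(120.0, 200.0))
```

Expressing every year's price on the 1972 base removes general inflation, so that the regression sees only real changes in the price of gasoline.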


X3, Year

The variable X3 represents the year in which all the
variables, both dependent and independent, apply for a
particular observation, i.e., the row in the data tables
of Appendix A. This is simply the year, shown in the fifth
column in the data tables of Appendix A as 1970, 1971, ...,
1982. This variable was introduced to reflect the time
effect on AADT (Y) and the X-variables, and to study the
residual patterns against time. As a general trend, AADT
increases as time passes. It is assumed that using X3 as
a variable will lead to high statistical correlation with
the other predictor variables (X's) and, in that event, X3
could be dropped from the models.


X4, County Population, and
X8, State Population

Decennial Bureau of Census records [9,11] on population
are used for the state and county values. These
variables are taken as predictor variables on the assumption
that the response variable Y (AADT) in a particular
year at a place is dependent on the number of people living
there. The variable X8 was used only for interstates
and principal arterials, because it was believed that
state level data influence those highways that run across
the state. Intercensus estimates of X4 and X8 from the
Indiana School of Business [26] were used in the data
tables for years other than census years. The Indiana
Business School also projects the population for every
fifth year in the intermediate future. Its projections
are made at the county level, based on the fertility,
mortality, and net migration experiences of the county
populations. The state forecasts are the sum of
the forecasts of the 92 individual counties. The projections
are based on past trends and patterns, but also
involve judgments, because simple historical extrapolation
is not always reliable. Column 6 of the data tables in
Appendix A represents X4, and column 10 of Tables A1 and
A2 in Appendix A represents X8 for the years 1970 to 1982.


X5, County Households, and
X9, State Households

A household includes all persons who occupy a housing
unit. A housing unit is a house, an apartment, a group of
rooms, or a single room occupied as separate living
quarters or, if vacant, intended for occupancy as separate
living quarters. Data for total households include all
occupied housing units. The number of occupied housing
units is the same as the number of households. The
housing statistics presented here for the years 1970 and
1980 are based on the results of the 1970 and 1980 Census
of Population and Housing, conducted by the Bureau of
Census as of April 1, 1970 and 1980 [9]. Some of the data
collected by the Bureau of Census were collected on a 100
percent, or complete-count, housing inventory, while other
data were obtained from sample estimates. The samples
were of 5 percent, 15 percent, and 20 percent, depending
on the subject covered. The sample data have been
"weighted" or "inflated" to reflect the entire population
or universe. Exact agreement, therefore, is not to be
expected between data based on samples and data resulting
from complete counts.

The total number of households in a county in a
particular year is X5 for that year. X9 is the statewide
value. It was found in Neveu's study [38] that the number
of households is a better predictor of AADT than
population. The predictor variables (X5 and X9) are
chosen on the assumption that the response variable Y
(AADT) will be adequately explained by using them in
models. The variable X9 was used only for Interstates and
principal arterials, because it was believed that state
level data influence highways that run across the state.
The Bureau of Census [9,11] gives the values of X5 and X9
for each census year, while the Indiana Business School
makes projections of households for each county. The
state-level projection is then simply the sum of the 92
county values. The estimates of intercensus households
between 1970 and 1984 were obtained by the procedure
described below.


Figure 4.1 presents the ratio of population to
households for the past three census years: 1960, 1970,
and 1980. The state and the counties with Automatic
Traffic Record (ATR) stations are shown on the plots of
Figure 4.1. The figure indicates that the slope for 1970
to 1980 is greater than that for 1960 to 1970. Although
not shown in Figure 4.1, the slope in some counties was
positive in the decade 1950 to 1960. It is assumed from
the nature of the curve that the average household size
(total population/total households) will not change
significantly for the decade 1980 to 1990 with respect to
its earlier decade. The general trend will continue to be
one of decreasing household size. The slopes of
population/household between 1970 and 1980 are less than
0.04 per year (see Figure 4.1). With these mild slopes,
it is assumed that the average household size is changing
uniformly between the census years and that the same rate
could be expected for the next 3 or 4 years after a
census. Based on the above assumptions, households for
the years 1971 to 1979 and 1981 to 1984 are computed by
using equation 4.2 for the whole state and for counties
with ATR stations.


\[
HH_{19XX} = \frac{POP_{19XX}}{POP/HH_{1970} + \dfrac{POP/HH_{1980} - POP/HH_{1970}}{10}\,(19XX - 1970)} \tag{4.2}
\]

where,

$POP/HH_{1970}$ = Ratio of population to households
                  in the year 1970,
$POP/HH_{1980}$ = Ratio of population to households
                  in the year 1980,
$POP_{19XX}$    = Population in the year 19XX
                  (1971 ≤ 19XX ≤ 1979 and 1981, ..., 1984),
$HH_{19XX}$     = Households in the year 19XX.

Column 7 of the data tables in Appendix A represents
X5, and column 11 of Tables A1 and A2 in Appendix A
represents X9 for the years 1970 to 1982.
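The intercensus household estimate can be sketched in a few lines of Python. This is only an illustration of the uniform-change assumption stated above; the function name and the county figures are hypothetical and are not taken from Appendix A.

```python
def estimate_households(population, year, ratio_1970, ratio_1980):
    """Estimate households in `year` from that year's population.

    Assumes the persons-per-household ratio changes uniformly between
    the 1970 and 1980 censuses and keeps the same annual rate for a
    few years after 1980.
    """
    # Annual change in persons per household between the two censuses.
    annual_change = (ratio_1980 - ratio_1970) / 10.0
    # Ratio interpolated (or extrapolated) to the requested year.
    ratio_year = ratio_1970 + annual_change * (year - 1970)
    return population / ratio_year


# Hypothetical county: 3.20 persons per household in 1970, 2.80 in 1980.
hh_1975 = estimate_households(30000, 1975, ratio_1970=3.20, ratio_1980=2.80)
print(round(hh_1975))  # ratio in 1975 is 3.00, so about 10000 households
```

For years after 1980 the same call extrapolates the 1970 to 1980 rate, which is reasonable only for the few post-census years mentioned in the text.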


X6, County Employment, and
X10, State Employment

Employment data [11,13,24] constitute an economic
variable. The County Employment Patterns [24] are
released each summer and provide "covered employment" data
for each month, each county, and each employment category
for the previous calendar year. County Employment
Patterns published prior to 1983 do not provide state
employment figures. According to the 1983 edition, total
"covered employment" consists of: 1. Mining; 2.
Construction; 3. Manufacturing: (a) Food, (b) Textiles,
(c) Lumber, Wood Processing, (d) Furniture, (e) Paper,
(f) Printing, (g) Chemicals, (h) Petroleum Products,
(i) Rubber, Plastics, (j) Leather, (k) Stone-Clay-Glass,
(l) Primary Metals, (m) Fabricated Materials,
(n) Non-electric Machinery, (o) Electric Machinery,
(p) Transportation Equipment, (q) Instruments, and
(r) Misc. Manufacturing; 4. Transportation, Communication,
Public Utilities; 5. Wholesale Trade; 6. Retail Trade;
7. Finance; 8. Agriculture & Services; and 9. Government.
According to the 1976 edition, covered employment
represents about 85 percent of nonagricultural wage and
salary employment and 78 percent of all employment. Major
exceptions to coverage of wage and salary employment are
in railroads, small nonprofit institutions, churches,
private households, and most government units. State
hospitals, schools of higher education, and local
government utilities are covered. In addition to these
exceptions, self-employed workers (both farm and non-farm)
are excluded from coverage.


"County Business Patterns - Indiana" [10], a
publication of the US Bureau of Census, furnishes
employment data for each year for the week including
March 12, and provides such data for the county and state
levels. This summary of employment excludes government
employees, railroad employees, self-employed persons, etc.
This publication also provides Federal Civilian Employment
for the mid-March pay period by county and state. The
"City and County Data Book" [9] is another publication of
the Bureau of Census that presents employment data by
county and state in every tenth year. The employment
figures in the "City and County Data Book" are prepared
from household surveys, where workers are counted
according to their place of residence; whereas for "County
Business Patterns", they are counted according to their
place of work. There are various reasons for differences
in the two series of data: differences in the reporting
systems they use; differences in the time period to which
the reports refer; sampling variations in the figures
based on the sample survey; and differences in industrial
classification resulting from the fact that the survey
information is obtained from respondents in workers'
households, whereas the County Business Patterns
industrial classification is based upon information either
from the employer or administrative sources.

There is little difference between the numbers in
"County Business Patterns" and in "County Employment
Patterns". The differences are mainly due to the
reporting systems and the periods to which the reports
refer. The average yearly employment numbers from the
"County Employment Patterns" [24] were taken as variable
X6, County Employment. The state employment values (X10)
are obtained by summing two tables, IE and Appendix, of
County Business Patterns [10]. Table IE [10] gives the
number of employees for the week including March 12 and
excludes government employees, railroad employees,
self-employed persons, etc. The Appendix table [10] gives
the number of federal civilian employees in the mid-March
pay period. The variable X6, column 8 of the data tables
in Appendix A, was used for all types of highways, but the
variable X10, column 12 of Tables A1 and A2 in Appendix A,
was used only in the case of Interstates and principal
arterials.


X11, Consumer Price Index (CPI) - US


This is an economic indicator at the national level.
The data for the consumer price index [8,17] are for the
US city average. The CPI data are used in the case of
Rural Interstates and Rural Principal Arterials, column 13
of Tables A1 and A2 in Appendix A. The CPI value for 1967
has been taken as 100 and all other years' data have been
expressed with respect to this base year. The CPI values
represent an economic comparison between different years,
and it is believed that AADT values in different years are
correlated with this economic indicator.

X12, Gross National Product (GNP), in billions
of 1972 dollars

These data are a measure of the value of goods and
services in the nation. It is believed that traffic
(especially truck traffic) on Interstates and principal
arterials will be explained by GNP, X12. This variable is
presented in column 14 of Tables A1 and A2 in Appendix A.
The data for GNP in billions of 1972 dollars were obtained
from a monthly publication entitled "Economic Indicators"
[13] and are available for each year.


X13, Per Capita Disposable Personal Income (national),
in 1972 dollars

This is also a national level economic indicator and
is used only in the case of Rural Interstates and Rural
Principal Arterials, presented in column 15 of Tables A1
and A2 in Appendix A. It is believed that this national
level income influences the traffic on national highways.
This variable was used earlier in New Mexico's [2] and
Neveu's [38] studies. Disposable personal income
represents the income after personal taxes and nontax
expenditures. The data for each year of per capita
disposable personal income were presented in "Economic
Indicators" [13] both in current dollars and in 1972
dollars.


The City and County Data Book [9] publishes the per
capita personal income and median family income at the
state and county level for the year before the census
years. An estimate is required for other years. An
attempt to estimate incomes by graphical interpolation was
found to be unreliable. Moreover, future values for
either of these income variables are difficult to
forecast, especially in an economy that is subject to
rapid changes. Based on the stated criterion of using
independent variables that are easily available and simple
to forecast, the present data tables for 1970 to 1982
exclude any income variables at the state and county
levels from consideration. The national level data are
readily available from "Economic Indicators" [13], where
the data are presented for each year. The future value of
this national level income in 1972 dollars can be
reasonably estimated by extrapolating the graphical plot
(income vs. year) of the data from "Economic Indicators"
[13].

4.3 The Data Tables

The four data tables for aggregate analysis, one for
each category of highway identified in Table 4.2, are
presented in Appendix A as Tables A1 through A4. Those
stations in a functional category whose data were clearly
well out of the range of values for most of the stations
in their category were not used in the development of an
aggregate model. Instead, these stations were "saved" to
test the ability of an aggregate model to "predict" their
AADT values. The variables X7 to X13 were used as
candidate background factors only in the case of Rural
Interstates and Rural Principal Arterials. The variables
X1 to X6 were candidates in all highway categories. Each
row or case of Appendix A corresponds to the year given
under column 5. The tables of Appendix A are labeled in
rows to identify a row or observation that corresponds to
an Automatic Traffic Record (ATR) count station.

The data tables present all possible cases or
observations for ATR count stations in rural Indiana
between 1970 and 1982. The resulting numbers of cases
were 26 for Rural Interstates, 39 for Rural Principal
Arterials, 52 for Rural Minor Arterials, and 37 for Rural
Major Collectors.


The data tables will be analyzed in two ways: by
using disaggregate and aggregate techniques. In
disaggregate analysis, each station (including those
dropped from the aggregate data tables) will be analyzed
separately. Station- or location-specific models for
highways having similar characteristics will be developed.
In aggregate analysis, stations under a given category of
highway will be analyzed as a group, and a model
applicable to any highway classifiable within a certain
group will be proposed. The value of each approach for
each highway type will be assessed through some trial
forecasts in Chapter 5.

4.4   Chapter  Summary 


The central idea of this chapter is to describe the
variables used in model development. The variables
identified in Chapter 3 have been discussed and the
sources of their numerical values are given. Explanations
behind the uses of all the predictor variables (or
independent variables, X's) are given. The methods by
which certain data are estimated or converted to a form
compatible with the proposed model are presented. The
reasons behind dropping some variables from consideration
have also been briefly discussed. Some of the earlier
attempts at data acquisition are also presented. This
chapter is a guide to the data tables appearing in
Appendix A.
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CHAPTER  5 


STATISTICAL  ANALYSIS  AND  MODEL  DEVELOPMENT 


The statistical analyses of the variables identified
in Chapters 3 and 4 are described in this chapter. First,
models to predict future traffic, based on the data tables
of Appendix A, are developed. The performance of these
models is then tested by trying to predict the
observations that were not included in the development of
the models.

5.1  Introduction  to  Statistical  Analysis 

As was mentioned in Chapter 3, in order to develop a
reasonable causal relationship, a regression procedure
that fits a least squares estimator of AADT to the
background variables is the basis for the development of
the model. The regression approach was selected because:
(1) the SPSS package permits computation of the elasticity
of the dependent variable (in this case, AADT) with
respect to the independent variables, (2) it provides an
estimate of the function regressed (here, AADT) that could
also be used for prediction purposes in the future, and
(3) regression allows, by means of linear tests associated
with it, testing the significance of the effects of
different variables (X's) in the equation.
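The least squares fit and the elasticity mentioned above can be illustrated with a single predictor. The sketch below is not the SPSS procedure used in this study; it assumes one common definition of elasticity, the slope multiplied by the ratio of the sample means, and the population and AADT figures are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept
    return a, b


def elasticity_at_means(x, y):
    """Elasticity of y with respect to x, evaluated at the sample means."""
    _, b = fit_line(x, y)
    return b * (sum(x) / len(x)) / (sum(y) / len(y))


# Hypothetical station: county population (thousands) and AADT.
population = [20.0, 21.0, 22.0, 23.0, 24.0]
aadt = [4000.0, 4200.0, 4400.0, 4600.0, 4800.0]

a, b = fit_line(population, aadt)
print(a, b)                                   # intercept and slope
print(elasticity_at_means(population, aadt))  # percent change in AADT per
                                              # percent change in population
```

In this made-up example AADT is exactly proportional to population, so the elasticity is one: a 1 percent rise in population goes with a 1 percent rise in AADT.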

5.2 Preliminary Analysis

The preliminary statistical analyses were done to
identify any possible relationship between the dependent
(Y) and independent variables (X's) through scattergrams
and to check the normality and homogeneity of variance
assumptions in the regression approach.

5.2.1  Homogeneity  of  Variance 


The Statistical Package for the Social Sciences
(SPSS) one-way program [39] was used to examine the
homogeneity of variances of AADT between the stations in a
category of highway. For an equal number of observations
in each station or group, the homogeneity of variance of
the AADT was checked using the Cochran and Bartlett-Box
tests [3], treating the Y's for each station as a group.
The Burr-Foster Q-test [3] was performed to check the
homogeneity of variance of Y's at different stations for a
highway category with an unequal number of observations
among stations. The q statistic for the Burr-Foster
Q-test for unequal sample sizes was calculated from
equation (5.1).


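\[
q = \frac{\bar{\nu} \sum_{i=1}^{p} \left( \nu_i S_i^4 \right)}{\left[ \sum_{i=1}^{p} \nu_i S_i^2 \right]^2} \tag{5.1}
\]

where,

$\nu_i$       = Degrees of freedom for ith station or group
                ($\nu_i = n_i - 1$ and $i = 1, \ldots, p$),
$n_i$         = Number of observations for ith station,
$p$           = Number of stations or groups,
$S_i^2$       = Sample variance for the ith station or group,
$\bar{\nu}$   = Arithmetic average of degrees of freedom.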
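A minimal sketch of the two statistics, assuming the form of equation (5.1) given above and the standard definition of Cochran's C (largest group variance divided by the sum of the group variances). The function names and the small data sets are illustrative only; critical values must still be read from published tables.

```python
from statistics import variance


def burr_foster_q(groups):
    """Burr-Foster q for groups of possibly unequal size (equation 5.1)."""
    dof = [len(g) - 1 for g in groups]        # degrees of freedom per group
    var = [variance(g) for g in groups]       # sample variances S_i^2
    mean_dof = sum(dof) / len(dof)            # arithmetic average of dof
    numerator = mean_dof * sum(v * s2 ** 2 for v, s2 in zip(dof, var))
    denominator = sum(v * s2 for v, s2 in zip(dof, var)) ** 2
    return numerator / denominator


def cochran_c(groups):
    """Cochran's C: largest group variance over the sum of variances."""
    var = [variance(g) for g in groups]
    return max(var) / sum(var)


# Three hypothetical stations with identical variances: q and C both
# take their smallest possible value, 1/p with p = 3.
equal = [[1, 2, 3], [11, 12, 13], [4, 5, 6]]
print(burr_foster_q(equal), cochran_c(equal))   # both approximately 0.333

# Two hypothetical stations with unequal sample sizes and variances.
print(burr_foster_q([[0, 2], [0, 1, 2]]))
```

Larger values of q or C indicate more unequal variances, so homogeneity is rejected when the computed value exceeds the tabulated critical value.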


No one has come up with a β-level for homogeneity
tests that will indicate when the experimenter or
researcher should become concerned about making a
transformation. But a set of working rules that seem to be
effective for the practitioner has been advanced [3]:

1.  If the homogeneity test is accepted at the β > 0.01
    level, transformation is not needed.

2.  If the test is rejected at the β = 0.001 level,
    transformation is needed.

3.  If the result of the homogeneity test is somewhere
    between β = 0.01 and 0.001 and if there is a
    practical reason to transform, then the usual
    transformation can be done; otherwise, it is
    recommended not to perform the transformation.

4.  If the investigator has no theoretical knowledge of
    his variable, it is recommended to use β = 0.001 when
    the distribution of Y seems to have excessively long
    tails.
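The first three working rules can be written as a small decision function. This is one possible reading of the rules, not a procedure given in [3]: the function name is hypothetical, a level of exactly 0.001 is treated as falling in the middle band (as is done for Rural Minor Arterials in this study), and rule 4 is left to the analyst's judgment.

```python
def transformation_advice(level, practical_reason=False):
    """Apply working rules 1 to 3 to the level reported by a homogeneity test.

    level: significance level at which the homogeneity test result falls
    practical_reason: True if there is a practical reason to transform Y
    """
    if level > 0.01:
        return "not needed"            # rule 1
    if level < 0.001:
        return "needed"                # rule 2
    # Rule 3: between 0.01 and 0.001, decide on practical grounds.
    return "may be done" if practical_reason else "not recommended"


print(transformation_advice(0.471))                        # not needed
print(transformation_advice(0.001))                        # not recommended
print(transformation_advice(0.005, practical_reason=True)) # may be done
print(transformation_advice(0.0001))                       # needed
```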


The test results for homogeneity of variance in
Table 5.1 show that the highway categories 1, 2 and 3
satisfy the homogeneity of variance condition by both
the Cochran and Bartlett-Box tests. In the case of Rural
Interstates, the β-level for the Cochran test was
found to be greater than 0.05. Thus, the homogeneity
of variance for Rural Interstates is satisfied for a
Cochran β-level of 0.05. A β-level greater than 0.01
for the Bartlett-Box test satisfied homogeneity of
variance for Rural Interstates at a β-level of 0.01.
But the β-level for the Bartlett-Box test for Rural
Minor Arterial is 0.001. Based on the regression
analysis, a linear relationship between Y and the X's
is feasible. There is no apparent practical or
theoretical reason to transform Y. The distribution
of Y's at some stations is sparse, as indicated in
the data tables. Considering all these factors,
homogeneity of variance for Rural Minor Arterial was
accepted at a β-level of 0.001. The Cochran "C
values" were checked against critical "C values"
(Appendix Table C.8 [AS]) in the first two categories
of highway with a β-level of 0.05 and for rural minor
arterial with a β-level of 0.01.

Table 5.1
Results of the Tests for Homogeneity of Variance

A. Bartlett-Box and Cochran Tests (Equal Sample Size)

Highway Category             Cochran C   Bartlett-Box F   Remarks on β-level          Homogeneity
(No. of stations             [β-level]   [β-level]        Cochran     Bartlett-Box    of Variance
or groups)

1. Rural Interstate          0.6044      0.519            β > 0.05    β >> 0.01       Checked
   (2)                       [0.471]     [0.471]

2. Rural Principal           0.5666      1.949            β > 0.05    β >> 0.01       Checked
   Arterial (3)              [0.063]     [0.143]

3. Rural Minor               0.4971      5.384            β > 0.01    β = 0.001       Checked
   Arterial (4)              [0.023]     [0.001]

B. Burr-Foster Q-Test (Unequal Sample Size)

Highway Category             Calculated q   Critical q                Remarks on         Homogeneity
(No. of stations                            β = 0.01    β = 0.001     β-level            of Variance
or groups)

4. Rural Major               0.4938         0.4827      0.354?        β = .01 - .001     Checked
   Collector (3)

The Burr-Foster critical q-value [3] shows β-levels
for rural major collectors between 0.01 and
0.001. So, the homogeneity of variance was accepted
for rural major collectors at a β-level of 0.001,
using the same reasons discussed above for rural
minor arterials.

5.2.2  Normality 


The normality of the four data tables for the
four highway categories of Appendix A was analyzed by
means of the Shapiro-Wilk test [39] for each station
separately and after combining stations within a
highway category. Because of the few stations in
each category of highway, normality is not expected
when stations in a highway category are analyzed
together. But for each station separately, normality
is an expected result. The result of this test is
shown in Table 5.2. In this table, small values
of W with a smaller β-level are significant, i.e., lead
to rejection of normality.


Table 5.2

Results of the Test for Normality

Highway Category       Station(s)          No. of   Shapiro-Wilk   β-level        Normality
                                           Cases    W

1. Rural Interstate    (i) All stations    26       o.«oa          0.05 - 0.10    Checked
                       (ii) 172A           13       0.8909         0.10 - 0.50    Checked
                       (iii) 3070F)        13       0.9:27         0.10 - 0.50    Checked

2. Rural Principal     (i) All stations    39       0.892:         <0.01          Unchecked
   Arterial            (ii) 68A            13       0.9161         0.10 - 0.50    Checked
                       (iii) 173A          13       0.9254         0.10 - 0.50    Checked
                       (iv) 254E           13       0.9227         0.10 - 0.50    Checked

3. Rural Minor         (i) All stations    52       0.8880         <0.01          Unchecked
   Arterial            (ii) 25A            13       0.9003         0.10 - 0.50    Checked
                       (iii) 301A          13       0.9753         >0.50          Checked
                       (iv) 313A           13       0.8987         0.10 - 0.50    Checked
                       (v) 262A            13       0.9171         0.10 - 0.50    Checked

4. Rural Major         (i) All stations    37       0.8051         <0.01          Unchecked
   Collector           (ii) 47A            11       C.9H7          0.10 - 0.50    Checked
                       (iii) 59A           13       0.9143         0.10 - 0.50    Checked
                       (iv) 5420A          13       0.0899         0.10 - 0.50    Checked

The test results for normality show that the Y's
are normal at a β-level greater than 0.10 within each
station location. Some of the stations satisfied the
normality criterion with a β-level greater than 0.50.
But the Y's of all the stations together under a
highway category did not satisfy the normality
criterion, except Rural Interstates at a β-level
greater than 0.05. Different transformations (for
example: square-root, log,
$\{Y_{\max}^{0.5} - [Y_{\max} - Y_i]^{0.5}\}$,
etc.) were done on the Y's to satisfy normality for each
category of highway. But these transformations
failed to satisfy normality when the normality test
was done on the transformed Y's.

The reason for nonnormality within a category of
highway is the wide variation of Y's among the
stations. The addition of stations would help to
achieve normality. In the case of Rural Interstates, the
normality hypothesis was accepted at a β-level of
0.05.

5.2.3 Scattergram


Scatterplots of the dependent variable (AADT)
against the independent variables are presented in
Appendix B for the four highway categories and in
Appendix D for the two stations, 68A and 7047A.
The scatterplots were prepared with the help of the
Statistical Package for the Social Sciences (SPSS)
[39] to identify any apparent trends among the
variables. The plots in Appendix B show the gaps and
clusters among the stations. The addition of new
count stations would help to remove or reduce such
gaps and establish better statistical relationships.
The plots in Figures D1 to D17 do not show any
clusters, but these plots show a general linear
trend. Slight departures from the linear trend are
noticed in the Appendix D plots at years beginning
with 1980. The plots of AADT vs. Gasoline Price are
more scattered and thus indicate less correlation
between these variables. Although clusters are found
in several plots in Appendix B, a good linear trend
is present (for example, Figures B1.8, B3.5, B4.4,
etc.).

5.2.4  Conclusions  from  Preliminary  Analysis 


Homogeneity of variance tests, considering each
station as a group, show equal variances among the
groups for each category of highway. The normality
hypothesis is accepted for each station separately
and for Rural Interstates as a group. The reason for
normality of Y's for Rural Interstates is the
insignificant variation in Y's at its two stations.
The main reason for nonnormality in the other
categories of highways is the wide variation or gap
in Y's among the stations, i.e., an insufficient
number of count stations for each category of
highway. At the same time, because of the small number
of observations in each category of highway, sampling
of data was not done. Normality is an expected result
in sampling cases when such pooling is done.
It is apparent that, with the installation of new
stations that will eliminate the gaps in Y's, the Y's
will tend to be normal. The Y's are not experimental,
and hence, when useful transformations on Y fail to
achieve normality, normality is possible only with
counts of Y's that fall between the gaps. The
normality test shows that analysis for each station
separately will yield a better model than that for
the combination of stations within a highway
category.


It appears that normality tests with the
available count stations do not support the idea of
combining the stations within a category of highway.
But it is also clear from these analyses that the
normality of Y's for a category of highways is
expected for a larger number of count stations and/or
in sampling cases in pooled analysis. On the other
hand, the AADT data for each station do confirm the
normality assumption (see Table 5.2). The scatterplots
for the stations, both separately and together, do not
show gross departures from a linear relationship in
most of the cases for demographic and economic
indicators. No recognizable pattern other than
linear is noticeable in these plots. Scatterplots of
gasoline price and time (Appendix Figures B1.2, B2.2,
B3.2, B4.2, and B1.3, B2.3, B3.3, B4.3) are more
scattered and indicate lower correlation with AADT.


In the next two sections (5.3 and 5.4), two
types of analyses will be carried out. In Section
5.3, aggregate analysis combining all the stations
within a category of highway is employed to develop
an aggregate model for each category of highway. In
Section 5.4, disaggregate analysis of each station
separately is performed, and the resulting models
will be location-specific.

5.3 Aggregate Analysis

In this section, models are developed for each of the
four categories of highway. In the selection of variables,
theoretical judgments, together with the results of
statistical analyses, are taken into consideration. After
developing the models, their performance is tested against
the data for the stations that were not used in the
development of the models.

In aggregate analysis, the stations were pooled under
a category of highway. But the data for stations clearly
out of the range of values for most of the stations in
their category were not used in the development of a
model. From a statistical standpoint, it is wise to
restrict prediction to the region of the X-space from
which the original data were obtained. In this aggregate
analysis, the X-space is wider than the X-space of the
disaggregate analysis. Aggregation of data also helps to
increase the number of observations or cases.
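The advice to restrict prediction to the observed region of the X-space can be checked mechanically before a forecast is made. The sketch below is an illustration only: the function name, variable names and values are hypothetical, and a variable-by-variable range check is a necessary but not a sufficient test, since a point can lie inside every individual range and still fall outside the joint region covered by the data.

```python
def out_of_range_predictors(observed, candidate):
    """Return the predictors whose candidate value lies outside the
    range of values used to develop the model.

    observed:  dict mapping predictor name -> list of observed values
    candidate: dict mapping predictor name -> value proposed for a forecast
    """
    flagged = []
    for name, values in observed.items():
        if not (min(values) <= candidate[name] <= max(values)):
            flagged.append(name)
    return flagged


# Hypothetical pooled data for one highway category (not from Appendix A).
observed = {
    "county_population": [18000, 25000, 41000, 60000],
    "county_households": [6000, 8500, 14000, 21000],
}
candidate = {"county_population": 72000, "county_households": 20000}

print(out_of_range_predictors(observed, candidate))  # ['county_population']
```

A forecast for the flagged case would be an extrapolation beyond the pooled data and should be treated with corresponding caution.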

5.3.1 Multiple Linear Regression Analysis


In  this  section,  the  results  of  some  analyses  are 
presented.  Each  analysis  is  discussed  briefly,  together 
with  some  interpretations  and  criteria  for  selection. 
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5.3.1.1  Correlation  Matrix 

The statistical analysis begins with the study of the
correlation matrix for the various factors considered.
Table 5.3 shows the correlation matrix for the four
categories of highway. The SPSS [39] regression program
was used to obtain the correlation matrix. The
correlation coefficient (r) in this table shows the
intercorrelation between the variables considered.

An important fact regarding this correlation
coefficient is that, when independent variables are highly
correlated, the regression coefficient of any independent
variable depends on which other independent variables are
included in the model. In the case of highly correlated
independent variables, a regression coefficient does not
reflect any inherent effect of the particular independent
variable on the dependent variable, but only a marginal or
partial effect, given whatever other correlated
independent variables are included in the model
[15,19,37].


In general, when two independent variables are correlated with
each other, intercorrelation or multicollinearity among them is
said to exist [37]. The three important problems that arise when
using highly correlated variables are:


1.  Adding or deleting an independent variable changes the
    regression coefficients.

2.  The extra sum of squares of regression associated with an
    independent variable varies depending upon which independent
    variables are already in the model.

3.  The estimated regression coefficients individually may not be
    statistically significant, even though a definite statistical
    relationship exists between the dependent variable and a set of
    independent variables. These problems can also arise without
    substantial multicollinearity being present, but only under
    unusual circumstances, not likely to be found in practice.

The existence of multicollinearity does not invalidate a regression
analysis, but neither is the absence of multicollinearity a
validation of a particular regression model. Multicollinearity is
also not a specification error [19]. The results of the correlation
coefficients will play a role in the selection of variables for the
model under development.


Table 5.3

Correlation Matrix (*)
(Aggregate Analysis)

(A) Rural Interstate

       Y     X1    X2    X3    X4    X5    X6    X7    X8    X9    X10   X11   X12
X1    .573
X2    .402  .680
X3    .?2?  .85?  .827
X4    .265  .912  ...   .607
X5    .538  .984  .75?  .890  .899
X6    .84?  ?     ...   .619  ...   .251
X7    .825  .866  ...
X8    ?     .860  ...
X9    .7??  .85?  ...
X10   .725  .739  .626  .763  .49?  .704  .625  .857  .647  .781
X11   .575  .800  .665  .974  .580  ?     .594  .905  .911  .972  .676
X12   .785  .8?9  .772  .972  .597  .870  .657  .986  .987  .979  ?     .9?3
X13   .817  .861  .754  .974  .606  .873  .644  .993  .990  .979  .657  .9?4  .994

(B) Rural Principal Arterial

       Y     X1    X2    X3    X4    X5    X6    X7    X8    X9    X10   X11   X12
X1    .786
X2    .275  .519
X3    ?     .639  .827
X4    .604  .876  .2?4  .259
X5    .661  .938  .364  .429  .974
X6    .633  .901  .542  .692  .667  .764
X7    .405  .64?  .779  .975  .247  .411  .717
X8    .402  .642  .79?  .9?4  .251  .415  .721  .993
X9    .401  ...   .830  ...   .260  ...   .702  .978  .980
X10   .354  ...   ?     ?     .19?  .31?  .697  .85?  .64?  .781
X11   ?     .602  .665  .174  .260  .427  .?54  .411  ?     .67?
X12   .413  .641  .773  .97?  .252  .416  .735  .98?  .987  .979  .872  .9?3
X13   .412  ?     .754  .9?4  .251  .414  .993  .990  .979  .85?  ?     .994


Table 5.3 (continued)

(C) Rural Minor Arterial

       Y     X1    X2    X3    X4    X5
X1    .800
X2    .018  .307
X3    .068  .383  .827
X4    .907  .95?  .123  .149
X5    .853  .989  .241  .290  .986
X6    .056  .416  .514  .646  .240  .345

(D) Rural Major Collector

       Y     X1    X2    X3    X4    X5
X1    .766
X2    .178  .593
X3    .164  .587  .818
X4    .915  .901  .354  .341
X5    .731  .954  .587  .618  .921
X6   -.453  .121  .452  .568 -.163  .205

(*)  For definition of variables, see Table 4.1;
     Xi represents X_i, where i = 1 to 13.
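The screening of intercorrelated predictors discussed in this section
can be sketched in a few lines of code. This is only an illustration
under stated assumptions, not the SPSS procedure used in the study:
the function names, the 0.90 threshold and all data values below are
hypothetical and chosen for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Lower-triangular correlation matrix as {(row, col): r}."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b])
            for i, a in enumerate(names) for b in names[:i]}


def highly_correlated_pairs(columns, threshold=0.90):
    """Pairs of variables whose |r| exceeds the (assumed) threshold."""
    matrix = correlation_matrix(columns)
    return sorted(pair for pair, r in matrix.items() if abs(r) > threshold)


# Made-up predictor values; X2 is deliberately almost twice X1.
data = {
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "X2": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "X3": [5.0, 3.0, 6.0, 2.0, 7.0, 4.0],
}
pairs = highly_correlated_pairs(data)
print(pairs)
```

With these made-up values only the pair (X2, X1) is flagged, which
mirrors the reasoning in the text: when two predictors carry nearly
the same information, only one of them would normally be kept.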


5.3.1.2  Stepwise  Regression 

The stepwise regression procedure is the most widely used automatic
search method. It selects one variable at a time for entry into the
model, until a desired subset of variables is selected. The
stepwise regression was carried out with the help of the SPSS
package [39]. The summary of that analysis is shown in Table 5.4.
The order in which variables entered into the regression model does
not reflect their importance in the model [37].


In designing the regression statement, the four associated
parameters (number of steps, F-value to enter [FIN], tolerance, and
F-value to remove [FOUT]) play important roles in the selection of
variables for the models. Three cases were considered in using
these parameters. In case A, all the parameters are default
parameters. This case will allow most variables to enter the
regression equation and will seldom force out a variable during the
stepwise procedure. The selection of FIN, FOUT and tolerance level
values in cases B and C allows more control by the analyst over
variable selection. For cases B and C, FIN and FOUT have been
computed using an F-table [37] of values F(1 - α; 1, n - p), where
α is the associated level of significance, p is the expected number
of terms in the regression equation (a value of 3 was used), and n
is the number of cases or observations. FOUT was kept less than
FIN. The calculated values of FIN and FOUT are shown in column 2 of
Table 5.4. For the parameter "number of steps", a default parameter
of twice the number of independent variables was used in all three
cases.


Table 5.4

Stepwise Regression Summary
(Aggregate Analysis)


Since the degrees of freedom associated with the mean squared error
(MSE) vary, depending on the number of X variables in the model,
and since repeated tests on the same data are undertaken, fixed
F-limits for adding or deleting a variable have no precise
probabilistic meaning [37]. MSE is defined as the error sum of
squares (SSE), the sum of squared deviations around the regression
line or plane, divided by its degrees of freedom, n - p. A minimum
tolerance of 0.01 was used in case B and case C to guard against
the entry of a variable that is highly correlated with other X
variables already in the model. The tolerance is defined as
1 - R^2, where R^2 is the coefficient of multiple determination
when an X variable is regressed on the other X variables in the
regression model. The tolerance specification of 0.01 provides that
no variable is to be added to the model if it has a coefficient of
multiple determination with the other X variables already in the
model that exceeds 1 - 0.01 = 0.99 or that would cause the R^2 for
any variable in the model to exceed 0.99.
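The tolerance rule described above can be illustrated with a small
sketch. It is not the SPSS implementation; the helper names and the
data are hypothetical, and only the first half of the rule (the
candidate's own tolerance) is checked. The tolerance is computed as
SSE / SST from a least-squares fit of the candidate on the variables
already in the model, which equals 1 - R^2.

```python
def fit_sse(columns, y):
    """Error sum of squares from a least-squares fit of y on the given
    predictor columns plus an intercept (normal equations)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(c + 1, k):
            factor = a[r][c] / a[c][c]
            for j in range(c, k + 1):
                a[r][j] -= factor * a[c][j]
    # Back substitution for the coefficients.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - tail) / a[i][i]
    fitted = [sum(bj * rj for bj, rj in zip(b, row)) for row in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def tolerance(candidate, in_model):
    """1 - R^2 when the candidate is regressed on the in-model variables."""
    mean = sum(candidate) / len(candidate)
    sst = sum((v - mean) ** 2 for v in candidate)
    return fit_sse(in_model, candidate) / sst


def passes_tolerance(candidate, in_model, minimum=0.01):
    """True if the candidate may enter under the minimum-tolerance rule."""
    return tolerance(candidate, in_model) >= minimum


# Hypothetical data: x2 is nearly an exact multiple of x1, x3 is not.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.1, 3.9, 6.1, 7.9, 10.1, 11.9, 14.1, 15.9]
x3 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]

tol_x2 = tolerance(x2, [x1])
tol_x3 = tolerance(x3, [x1])
print(round(tol_x2, 4), round(tol_x3, 3))
```

In this made-up case x2 would be refused entry once x1 is in the
model, while x3 would be allowed to enter.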


5.3.1.3  C_p-Statistic in All Possible Regressions


The C_p-statistic, R^2, etc. for a reasonable number of subsets of
variables were calculated with the help of the program "DRRSQU"
[42]. Some of those C_p and R^2 values are shown in Table 5.5.


Table 5.5

Selected C_p & R-Squared in All Possible Regressions
(Aggregate Analysis)


The C_p-criterion is concerned with the total mean squared error
(MSE) of the n fitted values for each of the various subset
regression models. When the C_p values for all possible regression
models are plotted against P, those models with little bias will
tend to fall near the line C_p = P [15]. Models with substantial
bias will tend to fall considerably above this line. In using the
C_p-criterion, the subsets of X variables for which (1) the C_p
value is small and (2) the C_p value is near P are considered for
the model. Sets of X variables with small C_p values have a small
total mean squared error, and when the C_p value is also near P,
the bias of the regression model is small. It may sometimes occur
that the regression model based on the subset of X variables with
the smallest C_p value involves substantial bias. In that case, one
may at times prefer a regression model on a somewhat larger subset
of X variables for which the C_p value is slightly larger, but
which does not involve a substantial bias component. Thus, one
should look for a regression with a low C_p value about equal to P.
When the choice is not clear-cut, it is a matter of personal
judgment whether one prefers a biased equation or an equation with
more parameters. Draper and Smith [15] recommend the use of the
C_p-statistic in conjunction with the stepwise method to choose the
best equation. Some statisticians suggest that all possible
regression models with a similar number of X variables to the
number in the stepwise regression solution be fitted subsequently
to investigate which subset of X variables might be best [37].
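As an illustration of the C_p-criterion, the sketch below evaluates
every subset of three hypothetical predictors. It assumes the usual
definition C_p = SSE_p / MSE_full - (n - 2p), where p counts the
parameters of the subset model including the constant and MSE_full
comes from the model with all candidate variables; the text does not
state the formula, and the helper names and data are invented for the
example. It is not the "DRRSQU" program.

```python
from itertools import combinations


def fit_sse(columns, y):
    """Error sum of squares from a least-squares fit of y on the given
    predictor columns plus an intercept (normal equations)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(c + 1, k):
            factor = a[r][c] / a[c][c]
            for j in range(c, k + 1):
                a[r][j] -= factor * a[c][j]
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - tail) / a[i][i]
    fitted = [sum(bj * rj for bj, rj in zip(b, row)) for row in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def mallows_cp(data, y, subset):
    """C_p of the model using `subset`, relative to the full model."""
    n = len(y)
    names = sorted(data)
    full_params = len(names) + 1
    mse_full = fit_sse([data[v] for v in names], y) / (n - full_params)
    p = len(subset) + 1
    return fit_sse([data[v] for v in subset], y) / mse_full - (n - 2 * p)


# Hypothetical data: y depends on x1 only, plus a small disturbance.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "x2": [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0, 6.0, 10.0],
    "x3": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 2.0, 8.0],
}
y = [5.1, 6.9, 9.05, 10.95, 13.1, 14.9, 17.05, 18.95, 21.1, 22.9]

cp = {subset: mallows_cp(data, y, subset)
      for size in (1, 2, 3)
      for subset in combinations(sorted(data), size)}
for subset, value in sorted(cp.items(), key=lambda item: item[1]):
    print(subset, "P =", len(subset) + 1, "C_p =", round(value, 2))
```

By construction the full model always has C_p equal to its own P, so
the comparison of interest is among the smaller subsets: those that
omit x1 in this example show a C_p far above P, which signals bias.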

The final selection of the model variables will be aided by
residual analyses. Information gained by these analyses, together
with the investigator's knowledge about the phenomenon under study,
will be helpful in choosing the final regression model to be
employed [37].

5.3.2  Preliminary  Screening  of  Candidate  Variables 


The screening of variables was not confined to statistical
analysis. Judgment regarding the questions listed in Table 5.6 was
considered while preparing data tables prior to regression
analysis. No screening of variables was done at that stage,
however. The initial inclusion of a large number of variables in
the models for Rural Interstates and Rural Principal Arterials is
justified by the fact that the omission of essential variables may
produce biased estimates, while the inclusion of a large number of
variables does not [19]. The basic questions in Table 5.6 will
again be reviewed in the selection of the variables. The goals of
the analysis that should be met in this selection process are shown
in Table 5.7. How these goals are considered for each category of
highway is demonstrated in the following sections.


Table 5.6
Some Fundamental Criteria for Variable Selection [15,19]

1.  Are the proposed variables fundamental to the problem?

2.  Availability of data (variables).
    (a)  Are annual data available?
    (b)  Are historical data available?
    (c)  What is the most recent year of data?
    (d)  Will data be available in future?

3.  Cost to obtain the data.

4.  How reliable is the data?

Table 5.7
Goals of the Analysis

1.  The final equations should explain more than 50% of the
    variation (R^2 > 0.50).

2.  The C_p value will be lowest and near to P.

3.  The number of predictor variables should be adequate for each
    model (*).

4.  The selection will respond well to the questions of Table 5.6.

5.  All estimated coefficients in the final model should be
    statistically significant at an alpha-level of 0.05 or 0.10.

6.  There should be no discernible patterns in the residuals.

(*)  As a general rule, there should be about ten complete sets of
     observations for each potential variable to be included in the
     model; e.g., if it is believed that the final practical
     predictive model should have four X-variables plus a constant,
     then there should be at least forty sets of observations
     (n ≥ 40) [15].


5.3.2.1  Rural Interstates


The correlation matrix in Table 5.3(A) shows that both X1 and X4
have moderate correlation with Y (r_{Y,1} = 0.575 and
r_{Y,4} = 0.265), but the correlation coefficient between these
variables is quite high (r_{1,4} = 0.912). The variables X7 to X13
are highly intercorrelated with each other. Any one of them, as
opposed to all of them, is eligible to explain Y and to lessen
multicollinearity.

The case A stepwise regression with default parameters includes
almost all the variables, but the b-coefficients of four of the
variables are negative, which is contrary to the expected positive
sign indicated in the scatterplots (Figures B1.5, B1.10, B1.11 and
B1.12) for the respective variables. The reason for this unexpected
result is the high intercorrelation between some of the variables.
The case B and case C stepwise regressions entered five variables
into the equation, three of them with negative b-coefficients, with
an R^2 of 0.984. The best subset according to the C_p-criterion has
too many variables. Furthermore, the correlation coefficients among
the variables are in some cases higher than 0.90.
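The sign check applied above, comparing the sign of each fitted
b-coefficient with the sign expected from the scatterplots, can be
sketched as follows. The function name, the coefficient values and the
expected signs are hypothetical and serve only to illustrate the
check; the one expectation taken from the text is that a negative
coefficient for gasoline price is an expected result.

```python
def sign_inconsistencies(coefficients, expected_signs):
    """Return the names of variables whose fitted b-coefficient has the
    opposite sign to the expected one (+1 or -1)."""
    flagged = []
    for name, b in coefficients.items():
        # A negative product means the fitted and expected signs differ.
        if b * expected_signs[name] < 0:
            flagged.append(name)
    return flagged


# Hypothetical fitted coefficients for one candidate equation.
fitted = {"X2": -53.6, "X4": 0.29, "X5": -0.03}
# Assumed expectations: gasoline price (X2) negative, the others positive.
expected = {"X2": -1, "X4": +1, "X5": +1}

inconsistent = sign_inconsistencies(fitted, expected)
print(inconsistent)
```

Here only X5 is flagged. A candidate equation with a flagged
coefficient would be set aside whenever another choice avoids the
problem, as is done in the screening described in this section.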


Considering all the points discussed above, the good subsets at
P = 2, 3 and 4 in Table 5.5, a two-variable subset at P = 3 with
R^2 = 0.673 and a three-variable subset at P = 4 with R^2 = 0.914
will be further analyzed to make the final selection from them.


5.3.2.2  Rural  Principal  Arterials 


The correlation matrix in Table 5.3(B) shows that X1, X4, X5 and
X6 are highly correlated with Y (r > 0.633). The gasoline price
(X2) has the lowest correlation with Y (r = 0.275). X1, X4 and X5
are highly correlated among themselves (r > 0.878), which argues
for the use of only one of these variables to avoid
multicollinearity in the resulting model. A second group of
variables is also highly intercorrelated, and only one of these
should be selected to avoid multicollinearity.


The case A stepwise regression with default parameters entered
almost all variables, with negative signs in the b-coefficients of
five of them (see Table 5.4). These negative signs are contrary to
the expected positive signs indicated by the scatterplots (Figures
B2.1, B2.3, B2.4, B2.6, and B2.13). The reason for these negatively
signed b-coefficients is a high degree of intercorrelation among
some independent variables, as shown in Table 5.3(B). So, the case
A stepwise regression choice will not be further analyzed if other
choices in Table 5.5 avoid this problem.

The case B and case C stepwise regressions entered eight variables
out of thirteen, with negative b-coefficients for X1, X4, X6 and
X11, and with an R^2 of 0.986. The R^2 of 0.990 in the case A
stepwise regression with twelve variables is only a negligible
increase over the eight-variable equation of the case B and case C
stepwise regressions.

In choosing the C_p and R^2 values in Table 5.5, judgment regarding
the questions of Table 5.6 and the correlation coefficients between
the variables were taken into consideration because there was a
large number of subsets that could be considered. For example, X1,
X4, and X5 are highly intercorrelated (r > 0.878), and any one of
these is considered, because X1, X4, X5 and X6 are almost equally
correlated with Y (r ≥ .800). The best subset according to the
C_p-criterion has too many variables. Considering all the points
discussed above together with the goals of analysis of Table 5.7,
the good subsets of variables at P = 2 and P = 3 in Table 5.5 will
be further analyzed to make the final selection from them.


5.3.2.3  Rural  Minor  Arterials 


The correlation matrix in Table 5.3(C) shows that X1, X4 and X5
have almost equal correlation with Y (0.800 ≤ r ≤ 0.907). The
variables X1, X4, and X5 are highly intercorrelated (r > 0.954).

The case A stepwise regression with default parameters includes all
variables with an R^2 of 0.974 (see Table 5.4). However, an R^2 of
0.823 was obtained with only X4 at step 1. The case B and case C
stepwise regressions enter all the variables except one with an R^2
of 0.970. The b-coefficients of X2, X5 and X6 are negative. The
negative coefficient of X2 (gasoline price) is an expected result.
The reason for the negative coefficients of X5 and X6 is their high
correlation with other variables in the model (for example,
r_{1,5} = 0.989, r_{3,6} = 0.646). With X5 and X6 each alone in the
equation, the sign of the b-coefficient was positive. The best
subset according to the C_p-criterion has too many variables.
Moreover, the subset with more than one variable usually has high
correlation between the variables (for example, r_{4,5} = 0.986).

Considering all the points discussed above and the criteria of
Table 5.6, the following subsets of variables were kept for the
final selection process:

 1.  X5
 2.  X4
 3.  X1
 4.  X4, X6
 5.  X5, X6
 6.  X2, X4
 7.  X2, X5
 8.  X4, X5
 9.  X1, X4
10.  X2, X4, X5
11.  X1, X4, X5
12.  X4, X5, X6

All these choices will provide an R^2 of at least 0.641.


5.3.2.4  Rural Major Collectors


The correlation matrix in Table 5.3(D) shows that X1, X4, and X5
have good correlation coefficients with Y of 0.766, 0.915 and
0.866, respectively. County employment (X6) and AADT (Y) are
negatively correlated, which is not the expected relationship, so
the selection of variable X6 will not be considered unless
supported by other analyses. The variable X3 has the lowest
correlation coefficient with Y (r = 0.164). X1, X4 and X5 are
highly correlated with each other (r > 0.731).


Table 5.4 shows that the case A stepwise regression with default
parameters includes all the variables with an R^2 of 0.947.
However, an R^2 of 0.837 was obtained with only X4 at step 1. The
case B and case C stepwise regressions select the variables X4 and
X6 with an R^2 of 0.932. The inclusion of other variables in the
case A stepwise regression increased R^2 by only a small amount.
The b-coefficients of X1 and X5 in case A and X6 in all cases are
negative. The negative coefficient of X2 (gasoline price) is an
expected result. So, the case A stepwise regression choice will not
be further analyzed, since other choices avoid the problems
associated with it.


The C_p values in Table 5.5 show that the variable set X3, X4 and
X6 at P = 4 is the best selection, with C_p of 2.40 and R^2 of
0.944. But the selection of X4 and X6 at P = 3, with C_p = 7.26 and
R^2 = 0.932, is the result of stepwise regression in cases B and C.
The variable sets {X1, X4 and X6} and {X3, X4 and X6} at P = 4,
with C_p of 6.25 and 6.76, respectively, are good for further
analysis. Note that X4 has high correlation with X5 (r = 0.921).
And X4 has negative correlation with X6 (r = -0.163), which is not
an expected result.


Considering the questions of Table 5.6 and the results of the
C_p-criterion, correlation matrix and stepwise regression, the
following subsets were kept for the final selection process:

1.  X4
2.  X1
3.  X5
4.  X4, X6
5.  X5, X6
6.  X4, X2
7.  X5, X2

These choices have R^2 of at least 0.534.


5.3.2.5  Summary  of  Preliminary  Screening  Process 


The R^2 value, C_p-criterion, stepwise regression, correlation
coefficients among variables, and the questions in Table 5.6 were
taken into consideration in the screening of variables in the
preliminary selection phase. The combination of these criteria,
discussed separately under each category of highway, resulted in
some good subsets of variables from which to make the final
selection. The preliminary screening reduces much work in further
analysis by considering only the good choices that result from it.
In this screening process, the first four goals of Table 5.7 were
taken into consideration. Subjective judgment also was made because
it was not always possible to meet all four of those goals at the
same time.


5.3.3  Final Selection of Variables

In the final selection, the goals of the analysis in Table 5.7 were
considered together to find the best subset of variable(s) from the
preliminary choices for each highway category. Goals 1 to 4 in
Table 5.7 were taken into consideration in the preliminary choices.
The selection of candidate variables from the preliminary choices
was then made through careful examination of all criteria except
the residual analysis and hypothesis testing concerning
b-coefficients. The ith residual, denoted by e_i, is the difference
between the observed value Y_i and the corresponding fitted value
Ŷ_i (i.e., e_i = Y_i - Ŷ_i). Residual analysis and testing
concerning regression coefficients were carried out in the final
selection. The final selection was then used to build the model.
The variables' coefficients were scrutinized using the following
three questions [15]:

1.  Are the coefficients reasonable?

    The least squares regression coefficients are adjusted for
    other variables in the regression. Thus, one may attempt to
    predict the response by changing only one variable, using its
    coefficient to decide how much the response changes. If all
    the estimated coefficients are independently estimated, this
    may do little harm. However, when the predictor variables are
    highly correlated and the estimated coefficients are also
    correlated, reliance on individual coefficients can be
    dangerous. A check can also be made to see if individual
    coefficients are directionally correct. For example, if X_i is
    the number of vehicle registrations and Y is the AADT, then
    b_i (the b-coefficient corresponding to X_i) should be
    positive. This question was examined by comparing the positive
    or negative sign of each coefficient with the expected sign.

2.  Is the equation plausible?

    Are the appropriate variables in the equation, and are any
    obvious variables missing? This question was considered in the
    residual analysis on the final selection, to see if any
    important variable was missed, and by examining the first,
    third and fourth questions in Table 5.6.

3.  Is the equation usable?

    The final model will contain a set of variables that can be
    used for predicting response variable(s) (in this case, AADT).
    This question was considered through the variable selection
    process by considering the second question in Table 5.6
    regarding the future value of the variable.

In the final selection of variable(s) for the model's equation, the
criterion of establishing a high R^2 was not considered
exclusively. R^2 is a relative quantity: it indicates how large the
regression sum of squares (SSR, the sum of the squares of the
deviations of the fitted regression values around the mean) is
relative to the total sum of squares (SST, the sum of squares of
total deviations around the mean), where SST is fixed for a given
set of observations and does not depend on the fitted model.
SST = SSR + SSE, where SSE is the error sum of squares or residual
sum of squares (the sum of squares of the deviations around the
regression line or plane). In some situations, data may be quite
variable and a large R^2 may not indicate a very good fit. In more
controlled situations, a relatively small R^2 may indicate a rather
good fit [19]. The value of R^2 cannot decrease when predictor
variables are added. Consequently, R^2 is always the maximum for
the full set with all predictor variables. So maximizing R^2 cannot
really be the sole selection criterion. However, one can
subjectively choose a subset of predictor variables that gives a
good value of R^2, such that using any additional predictor
variables results in only a marginal improvement in R^2. The
residual patterns were always examined before the final selection
was accepted for building the model.
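The decomposition SST = SSR + SSE and the behavior of R^2 when a
predictor is added can be checked numerically. The sketch below is
only an illustration: the helper names and the data are hypothetical,
and the fit uses plain normal equations rather than the SPSS package
used in the study.

```python
def ols_fitted(columns, y):
    """Fitted values from a least-squares fit of y on the given
    predictor columns plus an intercept (normal equations)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(c + 1, k):
            factor = a[r][c] / a[c][c]
            for j in range(c, k + 1):
                a[r][j] -= factor * a[c][j]
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - tail) / a[i][i]
    return [sum(bj * rj for bj, rj in zip(b, row)) for row in rows]


def sums_of_squares(columns, y):
    """Return (SST, SSR, SSE) for the fitted regression."""
    mean_y = sum(y) / len(y)
    fitted = ols_fitted(columns, y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    ssr = sum((fi - mean_y) ** 2 for fi in fitted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sst, ssr, sse


# Hypothetical observations.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
y = [2.3, 3.1, 4.8, 5.2, 7.1, 9.4, 8.0, 10.2]

sst1, ssr1, sse1 = sums_of_squares([x1], y)
sst2, ssr2, sse2 = sums_of_squares([x1, x2], y)
r2_one = 1.0 - sse1 / sst1
r2_two = 1.0 - sse2 / sst2
print(round(r2_one, 4), round(r2_two, 4))
```

SST is the same in both fits, and the second R^2 cannot be smaller
than the first, which is why a high R^2 alone is not a sufficient
reason to keep an additional variable.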
5.3.3.1  Regression on Preliminary Choices and Final Selection

Regression on the preliminary choices was done with the help of the
SPSS package [39]. The summary of that analysis is shown in Table
5.8 for all four categories of highway. The magnitude of the
b-coefficients and their inconsistency with reference to sign are
shown in Table 5.8.

(1)  Rural  Inters  tat es 


Table  5.8  shows  inconsistency  in  the  b-coef f ici en t s 
in  some  of  the  preliminary  choices.  The  more  variables  in 
the  model,  the  more  costly  and  complex  it  becomes  to 
implement  and  maintain.  If  the  model  is  restricted  to 
variables  without  inconsistency  in  their  coefficients  and 
judgment  is  applied  to  the  questions  in  Table  5.6,  then 
the  following  choices  are  eligible  for  the  final 
selection: 

1  .   X, 


2.   X 


8 


3.  X9 

4.  X2,  X? 

5.  X2,  Xg 

6.  X?,  X^ 

Inconsistency  in  regression  coefficients  is  due  to 
multicollinearity .  It  was  mentioned  in  Section  5.3.1.1 
that   this   multicollinearity   does   not   invalidate    the 


Table  5.8 

Multiple  Linear  Regression  Summary 

on  Preliminary  Choices 

(Aggregate  Analysis) 


Migr.-j, 

toriaDle 

t-eo»ffici 

«"iT  m  s*k 

oroer 

BlCW'Sif- 

R-rqusrec 

ouer j 1 1 

Categ  jtji 

$ut>;cr  ipts 
in  Eqn.(») 

ten<:  i*; 
in  t'5  -'  • 

F(»\ 

7 

.883t 

.  68 1 

c1  2,cr 

3 

.ei5t 

— 

.tZi 

16.275 

9 

.8111 

— 

.  5  '.■'■ 

27.7(-r 

RL~-: 

1.7 

-i«..u:s. 

.SOf.ii 

— 

.t:; 

55.71* 

2  • 

-16?. 195 , 

.82f5 

— 

.i2k 

55  8A1 

2,9 

-168.829, 

.8195 

— 

.673 

23. 6U! 

7,9 

6.816;, 

-.62i1 

-t; 

.814 

kc.Sbc 

7.11 

8.8878 

-W.fe'- 

-M1 

.%: 

62.681 

IHTER- 

swn 

1  2  ? 

-.1888  - 

H6.626, 

.e»<:: 

-M 

.981 

tr.?7)- 

2.t.i 

-w.t>'-. 

-.??-.: 

.8331 

-D5 

.??? 

55.288 

2.H.7 

-■     .7U. 

-.1593. 

.  >«6  1 

-tu 

.  896 

61.-597 

E  7  11 

-.S9~9 

8.9982.  -22 

.8*67 

-*5,-M1 

.91* 

77.IP9 

BUM 

5 

.8U5 

.m 

t28.*57 

1 

.%% 



.bkl 

6~.38i 

1 

.318-t 



.61? 

59.788 

pt  bn  i- 

k  7 

.3366 

.8*0v'* 



.6vi 

lt.ltt 

PflL 

».? 

.3911, 

.88UU? 

— 

.e?« 

l?.*1" 

1  - 

.MM. 

.mi 



.6  • 

39.36k 

5.(7) 

.WW, 

8W? 



.779 

63.289 

r-.a> 

.8286. 

■     ■ 



.776 

60.188 

fttTlt  W. 

• 

.688U5 



777 

62.71b 

Table     5.8    (continued) 
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Highuaj 

Uariabie 

b-coeff it i 

ent  in  sate 

order 

Inconsis- 

R-Squared 

0U6T311 

Category 

Subicripti 
in  Eqn.(») 

tencies 
in  b's(") 

F(*«) 

5 

.3157 

.727 

133. 136 

RURAL 

U 

.«32 

— 

.823 

233. 175 

1 

.1157 

— 

.61*1 

89. 123 

niNCit 

M 

.1878, 

-.1182 

-06 

.851 

U6.278 

5,6 

.3583, 

-.1829 

-b6 

.791 

93.828 

ARTERIAL 

(2),U 

-■6.982, 

.18U5 

— 

.832 

121.616 

2,5 

-S3. 582, 

— 

.7(4 

79.1*78 

»,5 

.2875, 

-.»3 

-05 

.855 

188.815 

1.H 

-.1858, 

.1825 

-61 

.871 

165.831 

2,  H,  5 

28.8888. 

.3W2, 

-.8189 

♦b2,-65 

.899 

162.197 

(1>,U,5 

.1888 

.32UU, 

-.95.13 

-bS 

.888 

127.651 

U,5.6 

.2653, 

-.5551,     - 

.3819 

-b5,-b6 

.8*5 

122.886 

RURAL 

ft 

.27-6 

.837 

188.862 

1 

.1981 

— 

.587 

W.693 

IIROOR 

E 

.7122 

— 

.53* 

18.856 

M 

.25W, 

-.1979 

-66 

.932 

233.367 

COLLEC- 

5,6 

.8379, 

-.3*58 

-66 

.913 

177.885 

TOR 

k,2 

.2892, 

-3U.56U1 

— 

.862 

186.838 

5,2 

.9292, 

-78.8612 

— 

.629 

28..  772 

(*)  The 95% confidence interval of the b-coefficient(s) for the variable(s) enclosed in brackets includes zero.
(**) '+' and '-' signs with b-coefficient(s) are inconsistent with the expected result.
(*«)     oueraii  significanc*=8.w 
regression analysis. The variable $X_3$ (year) has been dropped because it is believed that its effect is reflected in other variables and because year as a variable will always increase, while AADT may decrease with year.

The consumer price index variable has been dropped because it is a U.S. city average and its increasing pattern has no theoretical bearing on the observed upward trend of AADT.

The  important  test  statistics  evaluated  earlier  in 
this  chapter  for  the  six  candidates  are  summarized  in 
Table  5.9. 


Table 5.9

Summary Statistics of Choices for Final Selection
(Rural Interstate)

Choice   Variable Subscripts   C_p      R-Squared   Inconsistencies
Number   in Equation                                in b's
1        7                     213.7    .691        none
2        8                     230.4    .658        none
3        9                     321.1    .536        none
4        2, 7                  106.4    .829        none
5        2, 8                  110.2    .824        none
6        7, 9                  120.7    .810        none

These choices contain no inconsistencies in the regression coefficients. In Table 5.9, the $R^2$ and $C_p$ values for choices 1 and 2 are almost equal. The $R^2$ values for choices with two variables are not higher than those for choices with one variable. So, choice 2 of Table 5.9, with $X_8$ only, is taken as the final selection for further analysis for Rural Interstates.
(2)   Rural  Principal  Arterials 


None of the choices in Table 5.8 exhibits any inconsistency in the regression coefficients, but the 95 percent confidence interval of some regression coefficients includes zero. The choices with zero in the 95 percent confidence interval of their regression coefficients will not be considered for final selection. Choices with one variable in their equations have $R^2$ values of 0.618 to 0.776. There are choices with two variables in an equation without inconsistency in the b-coefficients and with $R^2$ greater than the choices with one variable. The following choices emerged as candidates for the final selection:

1. $X_5$
2. $X_4$
3. $X_1$
4. $X_4$, $X_7$
5. $X_4$, $X_8$
6. $X_4$, $X_9$

Regarding the questions of Table 5.6, it is apparent that all the variables in these final candidates are eligible to build the model.

The important test statistics evaluated earlier in this chapter for the above six candidates are summarized in Table 5.10.


Table 5.10

Summary Statistics of Choices for Final Selection
(Rural Principal Arterial)

Choice   Variable Subscripts   C_p      P    R-Squared   Inconsistencies
Number   in Equation                                     in b's
1        5                     559.6    2    .776        none
2        4                     903.7    2    .647        none
3        1                     981.4    2    .618        none
4        4, 7                  785.4    3    .692        none
5        4, 8                  791.9    3    .690        none
6        4, 9                  801.3    3    .686        none

The b-coefficients of $X_4$ and the $R^2$ values for the last three candidates are approximately the same (see Tables 5.8 and 5.10). All the choices of Table 5.10 will provide somewhat biased estimation with respect to the $C_p$-criterion. Considering the questions of Table 5.6, $X_8$ is preferable to $X_7$ or $X_9$, because annual historical data of state population ($X_8$) are available, but historical data of state households ($X_9$) are computed based on data on $X_8$. So, the data on $X_8$ are more reliable and less costly than those on $X_9$. Future data on $X_7$ are not available. Thus, the fifth choice in Table 5.10 (variables $X_4$ and $X_8$) is the final selection for further analysis for Rural Principal Arterials.
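The $C_p$ comparison used above can be illustrated with a minimal sketch. It assumes the standard Mallows definition, $C_p = SSE_p / MSE_{full} - (n - 2p)$, where $p$ counts the parameters of the subset model including the intercept; the thesis defines its criterion earlier in the chapter, so this form is an assumption here. The function name and all numbers are hypothetical and are not taken from the study's data.

```python
def mallows_cp(sse_subset: float, mse_full: float, n: int, p: int) -> float:
    """Mallows' C_p for a subset model with p parameters (intercept included).

    sse_subset: error sum of squares of the subset model
    mse_full:   error mean square of the model containing all candidate variables
    n:          number of observations
    """
    return sse_subset / mse_full - (n - 2 * p)


# Hypothetical example: 40 observations, subset model with 3 parameters.
cp = mallows_cp(sse_subset=120.0, mse_full=2.0, n=40, p=3)

# A subset model with little bias has C_p close to p; a C_p far above p
# (as for every choice in Table 5.10) signals biased estimation.
print(f"C_p = {cp:.1f}, excess over p = {cp - 3:.1f}")
```

Under this definition, the gap between $C_p$ and $P$ in Table 5.10 (several hundred against 2 or 3) is what the text calls "somewhat biased estimation".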


(3) Rural Minor Arterials

Table 5.8 shows inconsistency in the b-coefficient in some of the preliminary choices. The 95 percent confidence interval of the b-coefficients of some of the variables includes zero. At the same time, $R^2$ in the last three choices in Table 5.8 does not increase much with respect to earlier choices.

The important statistics evaluated earlier in this chapter for the remaining four choices of Table 5.8 are summarized in Table 5.11.


Table 5.11

Summary Statistics of Choices for Final Selection
(Rural Minor Arterial)

Choice   Variable Subscripts   C_p       P    R-Squared   Inconsistencies
Number   in Equation                                      in b's
1        4                     259.8     2    .823        none
2        5                     428.1     2    .727        none
3        1                     1578.7    2    .641        none
4        2, 5                  364.9     3    .764        none


The first two choices of Table 5.11 are better than the other choices. The first choice has the largest $R^2$ (0.823) among the four candidates in Table 5.11 but is very close to the second choice. The variables $X_4$ and $X_5$ have future values available. Thus either of the first two choices in Table 5.11 is equally good for making the final selection for Rural Minor Arterials. The variable $X_5$ is selected arbitrarily as the final selection for further analysis.


(4) Rural Major Collectors

Table 5.8 presents inconsistency in the b-coefficient for $X_6$ for the preliminary choices $\{X_4, X_6\}$ and $\{X_5, X_6\}$, respectively. The choice with $X_4$ has the largest $R^2$ and lowest $C_p$ among all the choices with one variable. The choices with two variables without inconsistency in regression coefficients do not provide a significant increase in $R^2$ with respect to the one-variable choices (see Table 5.8). Thus, the variable $X_4$ is the final selection for further analysis for Rural Major Collectors.

5.3.3.2 Graphic Residual Analysis on Final Selections

The residual plots shown in Appendix C were generated by the BMDP package [47]. The plots were done to check the aptness of each model. The ith residual, denoted by $e_i$, is the difference between the observed value $Y_i$ and the corresponding fitted value $\hat{Y}_i$ (i.e., $e_i = Y_i - \hat{Y}_i$).

Figures C1.1 to C1.4 are the plots of residuals against predicted AADT, Figures C2.1 to C2.4 are the plots of residuals against the final selected predictor variables, Figures C3.1 to C3.4 are the normal probability plots of residuals (the residuals against their expected values under normality) and Figures C4.1 to C4.4 are the plots of residuals against year for the four categories of highway. In Figures C1.1 through C2.4 and C4.1 through C4.3, the number of points plotted at each position is printed.

The  normal  probability  plots  (Figures  C3.1  to  C3.4) 
fall  reasonably  close  to  straight  lines,  suggesting  that 
the  error  terms  are  approximately  normally  distributed.  A 
slight  departure  is  noticed  in  the  case  of  the  normal 
probability  plot  for  Rural  Minor  Arterials  (Figure  C3.3). 
It  is  believed  [37]  that  this  small  departure  from 
normality  will  not  create  any  serious  problems. 


The plots of residuals against the fitted response variable and predictor variables, Figures C1.1 to C2.4, indicate no ground for suspecting the appropriateness of the linearity of the regression function or constancy of the error variance. The clustering of residuals in some cases is the effect of combining the stations in the analysis. It is believed that a greater number of stations will remove these clustering patterns. There are no suggestions in any of these plots that systematic deviation from the fitted response plane (in case of more than one variable in the equation) or line (in case of one variable in the equation) is present. The error variance varies in some of these plots with the level of $\hat{Y}$ and the X's, but this variation does not exhibit any gross departure. This slight variation with the level of $\hat{Y}$ and the X's is the result of pooling data from stations in a particular category of highway. These residual plots against $\hat{Y}$ and the X's do not indicate the presence of any outlier. In a residual plot, outliers are the points that lie far beyond the scatter of the remaining residuals, perhaps 4 or more standard deviations from zero [37].
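The outlier rule quoted above (residuals roughly 4 or more standard deviations from zero) can be sketched numerically. The snippet below is only an illustration: it scales each residual by the square root of the error mean square, which is one common way to standardize residuals and is assumed here rather than taken from the thesis. The function name and the data are hypothetical.

```python
import math


def flag_outliers(observed, fitted, n_params, threshold=4.0):
    """Return (index, scaled residual) pairs whose |e_i| / sqrt(MSE) >= threshold.

    observed, fitted: sequences of Y_i and fitted values Y-hat_i
    n_params:         number of regression parameters P (intercept included)
    """
    residuals = [y - f for y, f in zip(observed, fitted)]
    sse = sum(e * e for e in residuals)
    s = math.sqrt(sse / (len(residuals) - n_params))  # sqrt of error mean square
    return [(i, e / s) for i, e in enumerate(residuals) if abs(e / s) >= threshold]


# Hypothetical data: 29 residuals of +/-1 and one residual of +10.
fitted = [100.0] * 30
observed = [100.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(29)] + [110.0]

for index, scaled in flag_outliers(observed, fitted, n_params=2):
    print(f"observation {index}: scaled residual {scaled:.2f}")
```

A residual plot shows the same information visually; the numeric check is simply a convenient screen when many stations are pooled.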


Residual plots were also generated against variables not included in the model, to check whether some key independent or predictor variables could provide important additional descriptive and predictive power to the model. One such variable is the Year ($X_3$), which has not been included in any model. The plots of residuals against $X_3$, shown in Figures C4.1 to C4.4, do not indicate any correlation between the error terms over time, since the residuals are random around the zero line. Thus, it is confirmed that the appropriate variables are included in the model and no additional variable will provide significant power to the model.

5.3.3.3 Testing Hypothesis Concerning Regression Coefficients

The F-test for the regression relation determines whether the variables in the model have any statistical relation to the dependent variable. The hypotheses are

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_{P-1} = 0$$

$$H_a: \text{not all } \beta_k \ (k = 1, \ldots, P-1) \text{ equal } 0.$$

The test statistic is given by $F^* = MSR / MSE$. A sum of squares divided by its associated degrees of freedom is called a Mean Square (abbreviated MS). The Regression Mean Square (denoted by MSR) is $SSR / (P - 1)$ and the Error Mean Square (denoted by MSE) is $SSE / (n - P)$. The terms SSR and SSE have been defined earlier in Section 5.3.3.

If $F^* \le F(1-\alpha;\ P-1,\ n-P)$, then $H_0$ holds and indicates that the variables in the model do not have any statistical relation to the dependent variable. Larger values of $F^*$ lead to conclusion $H_a$. Table 5.12 shows the results at $\alpha$-levels of 0.05 and 0.10. The test results conclude the hypothesis $H_a$ (i.e., the relationships among the variables in the models exist), and $H_a$ cannot be rejected at an $\alpha$-level as low as 0.05. Hence, the regression relationships listed in Table 5.12 exist.

Table 5.12

Overall F-tests for Aggregate Analysis

Highway Category               Variable         F*        df_R, df_E   α         Is H_a true for
                               Subscripts for             (*)                    α = 0.05?   α = 0.10?
                               Full Model
1. Rural Interstate            8                46.275    1, 24        < .001    Yes         Yes
2. Rural Principal Arterial    4, 8             40.017    2, 36        < .001    Yes         Yes
3. Rural Minor Arterial        5                133.136   1, 50        < .001    Yes         Yes
4. Rural Major Collector       4                180.062   1, 35        < .001    Yes         Yes

(*) df_R = degrees of freedom for Regression,
    df_E = degrees of freedom for Error.
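As a cross-check on the overall F-test, $F^*$ can be recovered from $R^2$ and the two degrees of freedom, because $SSR = R^2 \cdot SSTO$ and $SSE = (1 - R^2) \cdot SSTO$, so $F^* = [R^2/(P-1)] / [(1-R^2)/(n-P)]$. The sketch below applies this identity to the $R^2$ values reported in Table 5.14 and the degrees of freedom listed in Table 5.12. The function name is hypothetical, and small differences from the tabulated F values are expected because the published $R^2$ values are rounded to three decimals.

```python
def overall_f(r_squared: float, df_regression: int, df_error: int) -> float:
    """Overall F* = MSR / MSE expressed through R^2 and the degrees of freedom."""
    return (r_squared / df_regression) / ((1.0 - r_squared) / df_error)


# (R^2, df for regression, df for error) per highway category.
models = {
    "Rural Interstate": (0.658, 1, 24),
    "Rural Principal Arterial": (0.690, 2, 36),
    "Rural Minor Arterial": (0.727, 1, 50),
    "Rural Major Collector": (0.837, 1, 35),
}

for category, (r2, df_r, df_e) in models.items():
    print(f"{category}: F* is approximately {overall_f(r2, df_r, df_e):.1f}")
```

All four statistics are far above the critical values at the 0.05 level for their degrees of freedom, which is consistent with the conclusion that the regression relationships exist.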

To test the significance of each variable ($H_0: \beta_k = 0$; $H_a: \beta_k \ne 0$ for $1 \le k \le P-1$) and each subset with more than one variable ($H_0: \beta_1 = \cdots = \beta_j = 0$; $H_a$: not all $\beta_j$ equal 0 for $1 \le j \le P-1$), a general linear test [37] was employed. The applicable F-statistic is shown in equation 5.2.

$$F^* = \frac{SSE(R) - SSE(F)}{df_R - df_F} \div \frac{SSE(F)}{df_F} \tag{5.2}$$

where,

$F^*$ = F statistic,
$SSE(R)$ = Error Sum of Squares for the Reduced model,
$SSE(F)$ = Error Sum of Squares for the Full model,
$df_R$ = degrees of freedom of the Reduced model, and
$df_F$ = degrees of freedom of the Full model.

The reduced model was obtained by dropping the element(s) to be tested from the full model under $H_0$. Table 5.13 shows the summary of the results obtained at $\alpha$-levels of 0.05 and 0.10. The test results show that when variables are dropped from the model, there still exist regression relationships. The hypothesis $H_a$ cannot be rejected at an $\alpha$-level as low as 0.05, and each variable in the model has a significant influence at a level of significance of 0.05.

Table 5.13

Partial F-tests for Aggregate Analysis

Highway Category               Variable Subscripts for   df_R, df_F   F*       α           Is H_a true for
                               Full Model  Reduced Model (*)                               α = .05?   α = .10?
1. Rural Interstate            (**)
2. Rural Principal Arterial    4, 8        4             37, 36       4.963    .025-.05    Yes        Yes
                               4, 8        8             37, 36       61.223   < .001      Yes        Yes
3. Rural Minor Arterial        (**)
4. Rural Major Collector       (**)

(*)  df_R = degrees of freedom for SSE for Reduced Model and
     df_F = degrees of freedom for SSE for Full Model.
(**) It has only one variable in Full Model.
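Equation 5.2 is easy to evaluate directly. The sketch below is a hedged illustration: the SSE values are hypothetical, chosen only so that the result matches the order of magnitude of the first Rural Principal Arterial row of Table 5.13, and the function name is not from the thesis. It also uses the standard fact that when a single variable is dropped, the partial $F^*$ equals the square of that variable's t-statistic in the full model.

```python
def general_linear_f(sse_reduced: float, sse_full: float,
                     df_reduced: int, df_full: int) -> float:
    """F* of equation 5.2: extra error sum of squares per dropped parameter,
    divided by the error mean square of the full model."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Hypothetical sums of squares with df_R = 37 and df_F = 36.
f_star = general_linear_f(sse_reduced=409.63, sse_full=360.0,
                          df_reduced=37, df_full=36)
print(f"partial F* = {f_star:.3f}")

# Dropping one variable: partial F* equals t^2 for that variable.
# t = 2.22777 is the State Population t-statistic reported in Table 5.14.
print(f"t^2 = {2.22777 ** 2:.3f}")
```

The squared t-statistic for State Population reproduces the tabulated partial F of 4.963, which is a quick consistency check between Tables 5.13 and 5.14.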

5.3.4 Model Development and Performance

The final regression equations are presented in Table 5.14, along with the $R^2$ values, overall F values, t-statistics and elasticities. The elasticities shown in this table were obtained from the output of Multiple Linear Regression on the final selected variables, computed according to equation 3.4 (Chapter 3). Not all the conditions of Table 5.7 have been met in all equations of Table 5.14. However, the equations that resulted from the specified criteria of Table 5.7 are the best possible, considering all the limitations. The equations in Table 5.14 use variables that are believed to be easily available from a variety of sources for both historical and future trends. Each of the variables is significant at the 95 percent confidence level. The equation for Rural Interstates has the lowest $R^2$ (0.658) and thus explains only 65.8 percent of the total variability of AADT by the use of variable $X_8$. The equations for rural principal arterials, rural minor arterials and rural major collectors explain 69.0, 72.7 and 83.7 percent of the variation in AADT, respectively, by the use of their included X-variable(s).


Table 5.14
Final Regression Equations from Aggregate Analysis (*)

1. Rural Interstate:
   AADT = -65569.684 + 0.015369 (State Population)
   R^2 = 0.658     F = 46.275
   State Population:   t = 6.80259     e = 4.83014

2. Rural Principal Arterial:
   AADT = -27?9?.?3? + 0.339113 (County Population) + 0.0044?9 (State Population)
   R^2 = 0.690     F = 40.017
   County Population:  t = 7.8244?     e = 1.47609
   State Population:   t = 2.22777     e = 2.79623

3. Rural Minor Arterial:
   AADT = 6?9.722 + 0.315?9? (County Households)
   R^2 = 0.727     F = 133.136
   County Households:  t = 11.53848    e = 0.?3377

4. Rural Major Collector:
   AADT = -7046.270 + 0.271510 (County Population)
   R^2 = 0.837     F = 180.062
   County Population:  t = 13.41872    e = 3.77374

(*) For unit and symbol of each variable see Table 4.1 of Chapter 4.

Using the elasticities obtained from the regression analysis, the forecasting model was developed for each category of highway by substituting those elasticities into equation 3.1 (Chapter 3). The models are presented in Table 5.15. These models generally satisfy all the criteria specified earlier. Each of the models is relatively simple, containing not more than two variables. The use of these models is also straightforward. The input values are the present year AADT and the present and future year values (the year for which the traffic forecast is needed) of the predictor variables. The data needed to predict rural traffic volumes with these models are readily available at the county and state levels. The models are easily used by anyone with a hand-held calculator; no large computer system is necessary.

The performance of the models in Table 5.15 was tested using data for those Automatic Traffic Record (ATR) stations not used in building the models. In making these trial "predictions", 1970 data were used as "present year" data. Using the appropriate historical values of the predictor variables, forecasts of AADT for the stations not used in model building, based on 1970 AADT, were computed and compared with the actual values of AADT. The results of the trial forecasts of the models, shown in Table 5.16, indicate that the models perform satisfactorily. The forecast errors are reasonably small in most of the cases and speak well for the reliability of the models. The larger forecast errors in some cases


Table 5.15
Aggregate Traffic Forecasting Models (*)

1. Rural Interstate:

$$AADT_f = AADT_p \, [\, 1 + 4.83014 \, (\Delta \, \text{State Population}) \,]$$

2. Rural Principal Arterial:

$$AADT_f = AADT_p \, [\, 1 + 1.47609 \, (\Delta \, \text{County Population}) + 2.79623 \, (\Delta \, \text{State Population}) \,]$$

3. Rural Minor Arterial:

$$AADT_f = AADT_p \, [\, 1 + 0.?3377 \, (\Delta \, \text{County Households}) \,]$$

4. Rural Major Collector:

$$AADT_f = AADT_p \, [\, 1 + 3.77379 \, (\Delta \, \text{County Population}) \,]$$

(*) (i)  For unit and symbol of each variable see Table 4.1 of Chapter 4.
    (ii) $\Delta$ represents the change in a predictor variable with respect to its present value, as a fraction. For example, $\Delta X = (X_f - X_p) / X_p$, where $X_p$ and $X_f$ denote the present and future values of $X$.
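The models of Table 5.15 share one form: the present-year AADT is scaled by one plus the sum of each elasticity times the fractional change in its predictor. The following sketch shows that calculation; the function and dictionary key names are hypothetical, and the population figures are invented purely for illustration. Only the elasticity 4.83014 comes from the Rural Interstate model.

```python
def forecast_aadt(aadt_present: float, elasticities: dict,
                  present: dict, future: dict) -> float:
    """AADT_f = AADT_p * [1 + sum_k e_k * (X_f,k - X_p,k) / X_p,k]."""
    growth = sum(
        e * (future[name] - present[name]) / present[name]
        for name, e in elasticities.items()
    )
    return aadt_present * (1.0 + growth)


# Rural Interstate model: one predictor, State Population.
interstate_elasticities = {"state_population": 4.83014}

# Hypothetical inputs: 2 percent population growth from a present AADT of 5000.
aadt_f = forecast_aadt(
    aadt_present=5000.0,
    elasticities=interstate_elasticities,
    present={"state_population": 5_000_000},
    future={"state_population": 5_100_000},
)
print(f"forecast AADT = {aadt_f:.0f}")
```

Because the elasticity is well above 1, a 2 percent rise in state population translates into roughly a 9.7 percent rise in forecast interstate AADT in this example.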


Table 5.16

Performance of Aggregate Traffic Forecasting Model

(1) Rural Interstate

Traffic Count   Base   Year   Forecasted AADT   Actual AADT   Forecast error
Station         Year          (AADT_f)          (AADT_a)      in percent (*)
5474A           1970   1971   5664              5627            0.66
                       1972   5894              6220           -5.24
                       1973   6060              6888          -12.02
                       1974   6165              6556           -5.96
                       1975   6170              6917          -10.80
                       1976   6276              7448          -15.74
                       1977   6441              7465          -13.72
                       1978   6647              7523          -11.64
                       1979   6792              7295           -6.90
                       1980   6868              6921           -0.64
                       1981   6862              6748            1.69
                       1982   6827              6745            1.22

(*) "+" sign indicates overprediction and "-" sign indicates underprediction.
    Forecast error in percent = (AADT_f - AADT_a) / AADT_a x 100.
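The error measure defined in the footnote can be reproduced in a few lines. This sketch uses two rows of the Rural Interstate results for station 5474A (years 1971 and 1972); the function name is hypothetical.

```python
def forecast_error_percent(forecast: float, actual: float) -> float:
    """Signed forecast error: positive = overprediction, negative = underprediction."""
    return (forecast - actual) / actual * 100.0


# (year, forecasted AADT, actual AADT) from Table 5.16, station 5474A.
rows = [(1971, 5664, 5627), (1972, 5894, 6220)]

for year, aadt_f, aadt_a in rows:
    print(year, round(forecast_error_percent(aadt_f, aadt_a), 2))
```

The two results, 0.66 and -5.24, agree with the tabulated values; note that the error is expressed relative to the actual AADT, not the forecast.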

Table 5.16 (continued)
(2) Rural Principal Arterial:

Traffic Count   Base   Year   Forecasted AADT   Actual AADT   Forecast error
Station         Year          (AADT_f)          (AADT_a)      in percent (*)
173A            1970   1971   10846             10988          -1.29
                       1972   11242             11545          -2.62
                       1973   11480             12515          -8.27
                       1974   11661             11692          -0.27
                       1975   11623             11433           1.66
                       1976   11685             12396          -5.74
                       1977   11917             12872          -7.42
                       1978   12335             13065          -5.59
                       1979   12599             12391           1.68
                       1980   12690             11486          10.48
                       1981   12584             11809           6.56
                       1982   12442             11607           7.19

(*) "+" sign indicates overprediction and "-" sign indicates underprediction.
    Forecast error in percent = (AADT_f - AADT_a) / AADT_a x 100.


Table 5.16 (continued)
(3) Rural Minor Arterial:

Traffic Count   Base   Year   Forecasted AADT   Actual AADT   AADT_f -   Forecast error
Station         Year          (AADT_f)          (AADT_a)      AADT_a     in percent (*)
279A            1970   1971   4825              4848             -23      -0.47
                       1972   4955              4946               9       0.18
                       1973   5092              4983             109       2.19
                       1974   5150              4612             538      11.67
                       1975   5169              4644             525      11.30
                       1976   5219              4988             231       4.63
                       1977   5349              4893             456       9.32
                       1978   5471              5225             246       4.71
                       1979   5572              5038             534      10.60
                       1980   5656              4591            1065      23.19
                       1981   5686              4338            1348      31.07
                       1982   5772              4419            1353      30.62

319A            1980   1970   1760              1566             194      12.38
                       1971   1831              1600             231      14.44
                       1972   1861              1652             209      12.65
                       1973   1891              2086            -195      -9.35
                       1974   1923              1720             203      11.80
                       1975   1961              1905              56       2.94
                       1976   2004              1947              57       2.93
                       1977   2039              2066             -27      -1.31
                       1978   2080              2214            -134      -6.05
                       1979   2133              2324            -191      -8.22
                       1981   2211              2068             143       6.91
                       1982   2241              2047             194       9.48

42A             1980   1972   3804              3956            -152      -3.84
                       1973   3816              3829             -13      -0.34
                       1974   3915              3939             -24      -0.60
                       1975   3950              4196            -246      -5.86
                       1976   4041              4546            -505     -11.11
                       1977   4127              4665            -538     -11.53
                       1978   4211              4327            -116      -2.68
                       1979   4301              4360             -59      -1.35
                       1981   4422              4529            -107      -2.36
                       1982   4555              4432             123       2.78

Table 5.16 (continued)
(3) Rural Minor Arterial (Cont'd):

Traffic Count   Base   Year   Forecasted AADT   Actual AADT   AADT_f -   Forecast error
Station         Year          (AADT_f)          (AADT_a)      AADT_a     in percent (*)
100X            1980   1971   8464              8251             213       2.58
                       1972   8502              7945             557       7.01
                       1973   8602              8402             712       5.07
                       1974   8648              8187             461       5.63
                       1975   8696              8075             621       7.69
                       1976   8784              8611             173       2.01
                       1977   8817              8924            -107      -1.20
                       1978   8880              9454            -574      -6.07
                       1979   8961              9389            -428      -4.56
                       1981   9005              9022             -17      -0.19
                       1982   9041              8837             204       2.31

256A            1970   1971   2652              2738             -86      -3.14
                       1972   2729              2710              19       0.70
                       1973   2734              2714              19       0.70
                       1974   2796              2524             272      10.78
                       1975   2840              2709             131       4.84
                       1976   2884              2771             113       4.08
                       1977   2874              2827              47       1.66
                       1978   2957              2940              17       0.58
                       1979   2948              2913              35       1.20
                       1980   3007              2861             146       5.10
                       1981   3045              2925             120       4.10
                       1982   3055              2900             155       5.34

(*) "+" sign indicates overprediction and "-" sign indicates underprediction.
    Forecast error in percent = (AADT_f - AADT_a) / AADT_a x 100.

Table 5.16 (continued)
(4) Rural Major Collector:

Traffic Count   Base   Year   Forecasted AADT   Actual AADT   AADT_f -
Station         Year          (AADT_f)          (AADT_a)      AADT_a
7047A           1970   1971   266               257                9
                       1972   296               227               69
                       1973   292               233               59
                       1974   281               226               55
                       1975   271               225               46
                       1976   271               231               40
                       1977   271               204               67
                       1978   271               224               47
                       1979   241               294              -53
                       1980   236               299              -63
                       1981   226               288              -62
                       1982   205               272              -67

30063A          1980   1979   752               877             -125
                       1981   800               824              -24
                       1982   793               767               26

54382A          1980   1979   1062              1159             -97
                       1981   984               973               11
                       1982   1029              878              151

200X            1980   1973   6547              8805           -2258
                       1974   6823              8834           -2011
                       1975   7155              9002           -1847
                       1976   7431              9033           -1602
                       1977   8038              9079           -1041
                       1978   8535              9457            -922
                       1979   8977              9636            -659
                       1981   9197              9226             -29
                       1982   9308              9004             304

"+" sign indicates overprediction and "-" sign indicates underprediction.
Forecast error in percent = (AADT_f - AADT_a) / AADT_a x 100.

are due to fewer cases and large variations in the response and predictor variables employed in the data tables among the stations and counties.

It must be kept in mind that the end use for the forecasted volumes is the design and planning of rural highway projects. These volumes are generally low enough that larger prediction errors (on the order of 20% to 50%) will not cause a significant change in the design criteria. If more years of data had been available, a better comparison of forecasting models with extrapolations might have been possible. However, this exercise prepares us for another comparison, aggregate vs. disaggregate models, to be conducted indirectly later in this chapter.

5.4 Disaggregate Analysis

In this section, each station has been analyzed separately and a separate forecasting model has been developed for each. The criteria for variable selection are the same as those in the aggregate analysis. Performance of the models has been tested with 1983 and 1984 data, which were not used in the development of the models. In disaggregate analysis, the number of observations on which to base each station's model is much smaller than in aggregate analysis, where some of the stations' observations were combined under a highway category. Furthermore, the range in X-variable values is smaller. The key issue here is whether the added consistency in using data from a single station will be enough to offset the reduced amount and range of data values.
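The disaggregate idea (one regression per station instead of one per highway category) can be sketched as follows. This is only an illustration of the approach for a one-variable model, written with closed-form least squares; the station labels, the predictor and all data values are hypothetical and do not come from the study's data tables.

```python
def fit_line(x, y):
    """Ordinary least squares for Y = b0 + b1 * X; returns (b0, b1, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ssto = sum((b - mean_y) ** 2 for b in y)
    return b0, b1, 1.0 - sse / ssto


# Hypothetical per-station data: (county population, AADT) by year.
stations = {
    "station_A": ([20000, 20500, 21000, 21500], [4000, 4150, 4280, 4450]),
    "station_B": ([35000, 35200, 35900, 36400], [9100, 9050, 9400, 9600]),
}

# One model per station, each based only on that station's observations.
for name, (population, aadt) in stations.items():
    b0, b1, r2 = fit_line(population, aadt)
    print(f"{name}: AADT = {b0:.1f} + {b1:.4f} * population, R^2 = {r2:.3f}")
```

With only a handful of observations per station, the fitted coefficients are sensitive to individual years, which is the trade-off between consistency and data volume that this section examines.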

No attempt was made to develop a disaggregate model for stations 30063A and 54382A under Rural Major Collectors, since only four observations of AADT were available for each of these stations. Also, no disaggregate model was developed for stations 313A and 47A. For these two stations, the AADT values were found to be almost constant over the period of analysis, which was not the case with the predictor variables. Complexity of statistical analysis arises as the number of variables increases and the number of observations decreases. To avoid this complexity, the US Consumer Price Index and Gross National Product variables were dropped from the data tables for Rural Interstates and Rural Principal Arterials. These variables were dropped here because they had failed to survive the variable selection process in the aggregate analysis. The variable $X_3$ (Year) has been kept in the data tables to study the residual pattern against $X_3$.

The analysis starts with the study of scatterplots of AADT (Y) against the X's at each station. The scatterplots were done with the help of SPSS [38] to identify any apparent trends of Y with the X's. In general, scatterplots of all stations show a linear trend, except for stations 47A, 262A, 279A, 313A and 7047A, which are more scattered. Two representative plots, for stations 68A and 7047A, are presented in Appendix D. Plots of AADT against Gas Price, as shown in Figures D2 and D13 in Appendix D, were found to be very scattered, which indicates that gas price is less effective in predicting AADT than other predictor variables. A slight decrease in AADT from its increasing trend is noticed in the scatterplots at years after 1980. A similar decrease was also observed in some X's when they are plotted against year.

5.4.1 Multiple Linear Regression Analysis

The same kinds of analyses have been done in this section for each station as were done in the aggregate analysis for each highway category. The interpretation and selection criteria presented during the aggregate analysis are also applicable in the disaggregate analysis.

The multiple linear regression analysis starts with the study of the correlation matrix. The SPSS [39] regression program was used to obtain the correlation matrix. Table 5.17 shows the correlation coefficients for the stations under analysis. In general, Table 5.17 shows that the independent variables (X's) are highly correlated among themselves. The Year ($X_3$) has low, moderate and high correlation (for example, station 262A: r = 0.011, station 134A: r = 0.429, station 173A: r = 0.973) with AADT (Y). In general, most of the independent or predictor variables (X's) have high correlation with AADT (Y), except for stations 279A and 262A (Rural Minor Arterials), and 47A and 7047A (Rural Major Collectors). But there is low correlation and, for some stations, negative correlation of the X's with Y (for example, stations 262A, 279A and 7047A). The reason for this low and/or negative correlation is reflected in Tables A3 and A4 and in the scatter plots of Figures D12 to D17. The AADT (Y) for the above stations remained almost unchanged and, in some cases, decreased during the years 1970 to 1982. However,
Table 5.17 (continued)

C.  RURAL  MINOR  ARTERIAL 
(i)  Correlation  Coefficients  for  Station  25A 


II 

8363B 

i2 

36205 

75395 

i3 

6952B 

9c  122 

82725 

l4 

.  79990 

. 9B676 

.  B0572 

. 96315 

«5 

7-4007 

.  97990 

. B3026 

. 99229 

.  9BS47 

16 

79281 

B4551 

53144 

.  74606 

62204 

y 

il 

i2 

i3 

i4 

79053 


(ii)  Correlation  Coefficients  for  Station  279A 


1 1 

--  14435 

i2 

-  £=7354 

- 71B43 

•  3 

-  3o449 

. 95979 

B2725 

i4 

-  23464 

.  9B033 

78397 

94552 

i5 

-.  34090 

97390 

.  B2S93 

.  99408 

97330 

16 

.  34805 

. B414? 

.  27336 

.  67956 

.  78502 

. 71510 

V 

il 

i2 

i3 

l4 

iS 

(iii)  Correlation  Coefficients  for  Station  301A 


(iv) 


il      7B648 

i2     .  37626 

.  B0773 

i3      72152 

. 99011 

.  B2725 

1 4      81044 

96363 

.  76825 

96921 

iZ            . 76S21 

. 99430 

.  80615 

.  99334 

99098 

1 6     .80360 

. 96084 

.  7797S 

. 94614 

.  9261 B 

.  94400 

<i 

C 

il         i2 

i3 

i4 

15 

elation 

oefficients 

for 

Station 

319^ 

II 


85658 


■  2 

. 53870 

.  B1982 

i3 

. 79787 

.  98915 

.  B2725 

l4 

. 82157 

.  98276 

.  B2266 

.  9B15B 

l5 

.  79497 

.  9B799 

.  83754 

.  99778 

98909 

16 

B8963 

. 97611 

.  73926 

.  94305 

. 9541B 

II 


i3 


i4 


94343 


i5 
Table 5.17 (continued)


(v)  Correlation  Coefficients  for  Station  42A 


II 

. 724B7 

„2 

43892 

7B383 

■  3 

. 71902 

97272 

79713 

<4 

7C575 

. 94392 

75351 

.  9B309 

>5 

70145 

95S90 

.  78355 

.  99633 

9939J 

16 

54981 

.  90619 

. 69489 

,  90749 

.  92962 

91636 


(vi)  Correlation  Coefficients  for  Station  100X

x1   .83416
x2   .44632  .69560
x3   .76272  .89162  .82297
x4  -.67455 -.79993 -.83231 -.98141
x5   .81099  .93039  .79869  .99185 -.94930
x6   .88684  .90417  .55097  .73029 -.62910  .76416
       y       x1      x2      x3      x4      x5


(vii)  Correlation  Coefficients  for  Station  256A


1 1 

79504 

i2 

47567 

81139 

>3 

. 80099 

9B389 

.  B2725 

»4 

65327 

91358 

. 70124 

. 87405 

i5 

78159 

.  9B751 

81796 

.  99082 

. 93046 

16 

B1BB5 

.  95091 

76557 

.  94277 

B3C54 

93o45 

V 

il 

■S 

i3 

l4 

i5 

(viii)  Correlation  Coefficients  for  Station  262A 


1 1 

10193 

■3 

-  34653 

.  79B22 

•  3 

01137 

98585 

.  82725 

l4 

08387 

98902 

81765 

990B3 

.5 

03900 

98429 

82977 

.  99755 

99640 

■  6 

14587 

95217 

75042 

926B9 

.  94005 

93066 
il        i2        >3        i4        i5 
Table  5.17  (continued)


D.  RURAL  MAJOR  COLLECTOR 


(i)  Correlation  Coefficients  for  Station  59A 


.1 

84968 

x2 

52371 

75334 

x3 

.  74815 

.  97079 

.  B2725 

.4 

. 84250 

.  98524 

.  82747 

.  97700 

«5 

B1155 

.  9846" 

.  83672 

.  98977 

99731 

x6 

B0040 

95242 

.  77553 

93384 

94097 

x3 


94503 


i5 


(ii)  Correlation  Coefficients  for  Station  200X

x1   .79710
x2   .43441  .64980
x3   .59079  .94177  .74156
x4   .74701  .98249  .74978  .96985
x5   .64486  .95903  .75746  .99567  .98718
x6   .86823  .81529  .42369  .64052  .75357  .68577
       y       x1      x2      x3      x4      x5

(iii)  Correlation  Coefficients  for  Station  5420A

x1   .63468
x2   .38976  .76977
x3   .58263  .96309  .82725
x4   .60278  .84599  .47167  .70871
x5   .62012  .98538  .79146  .98796  .80863
x6   .82502  .93209  .70946  .91203  .75721  .92576
       y       x1      x2      x3      x4      x5


(iv)  Correlation  Coefficients  for  Station  7047A

x1   .25854
x2   .63090  .77767
x3   .39755  .95552  .82725
x4  -.72214 -.66272 -.80579 -.82225
x5  -.15553  .86280  .47456  .75134 -.24244
x6   .49843  .88997  .74353  .91368 -.71662  .73136
       y       x1      x2      x3      x4      x5


(*)  For  definition  of  variables,  see  Table  4.1;
xi  represents  Xi,  where  i  =  1  to  10  and  13.
the  predictor  variables  (X's)  were  found  to  increase  over
that period. As a result, the X's were less effective in
explaining AADT for stations 262A, 279A and 7047A. Thus,
if historical AADT data are available for the point or
section of highway for which a forecast is desired, then
extrapolating the plot of AADT against time to the future
year will reveal any unreasonable value of future AADT
computed from the forecasting model(s). If the change in
AADT has not been significant over a period of time, then
it is reasonable to assume that the future value of AADT
will not change significantly either. In that case, using
predictor variables that increase significantly over the
same period will overestimate the future year AADT, and
simple extrapolation of the plot of AADT against time will
provide better results. In spite of the reduced
effectiveness of individual X's in predicting Y for stations
262A and 279A, further analyses have been carried out for
these stations because combinations of X's may provide
better results for some stations.
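
The extrapolation check described above can be sketched as follows. This is only an illustration and not the procedure used in the study: the straight-line trend, the function names, the 25 percent tolerance, and the sample counts are all assumptions made for the example.

```python
def linear_trend(years, aadt):
    """Least-squares line aadt = a + b * year; returns (a, b)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(aadt) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, aadt))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def check_forecast(years, aadt, target_year, model_forecast, tolerance=0.25):
    """Compare a model forecast with the extrapolated historical trend.

    `tolerance` is a hypothetical relative threshold, not a value
    taken from the study. Returns (trend_value, flagged).
    """
    a, b = linear_trend(years, aadt)
    trend_value = a + b * target_year
    relative_gap = abs(model_forecast - trend_value) / trend_value
    return trend_value, relative_gap > tolerance


# Made-up counts for a station whose AADT stayed flat.
years = [1970, 1974, 1978, 1982]
aadt = [5000, 5000, 5000, 5000]

# A model driven by steadily growing predictors might forecast 8000.
trend, flagged = check_forecast(years, aadt, 2000, 8000)
print(trend, flagged)
```

With a flat history the trend line stays at the historical level, so a forecast well above it is flagged for review.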


It was noticed during the aggregate analysis that the
case A stepwise regression with default parameters, as
defined in section 5.3.1.2, has little control over
variable selection; almost all the variables were
entered in that case (see Table 5.4). As a result, only
the case B and case C stepwise regressions, defined in
section 5.3.1.2, were carried out for the stations under
this disaggregate analysis with the help of the SPSS
package [39]. The case A stepwise regression was done only
for those stations for which no variables remained in the
equation after the case B and case C stepwise regressions.
The summary of the stepwise regression analysis is shown
in Table 5.18.
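
The entry decision in a stepwise regression can be illustrated with the partial F statistic. The sketch below is not the SPSS procedure itself. It assumes n = 13 annual observations per station, and it assumes that cases B and C use the 0.10 and 0.05 significance levels for entry. The function names are hypothetical. The critical values are standard F-table entries for (1, 9) degrees of freedom, and the R-squared and F values are those reported for station 25A in Table 5.18.

```python
def f_to_enter(sse_before, sse_after, n, p_after):
    """Partial F statistic for adding one variable to the equation.

    p_after counts the parameters (including the constant term)
    after the variable has entered.
    """
    return (sse_before - sse_after) / (sse_after / (n - p_after))


def should_enter(f_value, f_in):
    """A variable enters when its F value exceeds the entry threshold."""
    return f_value > f_in


# Step 2 for station 25A: R-squared rises from 0.700 to 0.867.
# SSE is proportional to (1 - R-squared), so the common factor cancels.
f_step2 = f_to_enter(1 - 0.700, 1 - 0.867, n=13, p_after=3)
print(round(f_step2, 2))  # close to the reported 12.528, up to rounding

# Step 3 for station 25A has a reported F value of 3.710.
F_CRIT_10 = 3.36  # F(.10; 1, 9), standard table value
F_CRIT_05 = 5.12  # F(.05; 1, 9), standard table value
print(should_enter(3.710, F_CRIT_10), should_enter(3.710, F_CRIT_05))
```

Under these assumptions the third variable would enter at the 0.10 level but not at the 0.05 level, which is consistent with the three-step and two-step results listed for the two cases.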

The C_p-statistic, R^2, etc. in the all possible
regressions were calculated with the help of a program
"DRRSQU" [42]. Some of the selected values of C_p and R^2
are presented in Table 5.19. Variable X (Year) and its
combinations with other X-variables are not presented in
Table 5.19. The variable X (Year) was kept only for graphic
residual analysis. Moreover, year is not suitable as a
predictor variable because it always increases, which is
not true for AADT (Y). The values of the other X's in the
data tables can increase or decrease, as AADT does over
the years. Moreover, the effect of X (Year) is reflected
in some other X's.
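
For reference, the two statistics tabulated from the all possible regressions can be computed as in the sketch below. It is not the DRRSQU program [42]; the function names and the small data set are made up for illustration. It uses the usual definition C_p = SSE_p / MSE_full - (n - 2p), where p counts the parameters including the constant term, so a one-variable equation has p = 2.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sse(columns, y):
    """Residual sum of squares of y on a constant plus the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def all_possible_regressions(data, y):
    """Return (subset, p, r_squared, c_p) for every non-empty subset."""
    n = len(y)
    names = sorted(data)
    sst = sse([], y)                      # constant-only model
    full_p = len(names) + 1
    mse_full = sse([data[v] for v in names], y) / (n - full_p)
    results = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            s = sse([data[v] for v in subset], y)
            p = size + 1
            results.append((subset, p, 1 - s / sst, s / mse_full - (n - 2 * p)))
    return results


# Made-up data: two candidate predictors and six observations.
data = {"x1": [1, 2, 3, 4, 5, 6], "x2": [2, 1, 4, 3, 6, 5]}
y = [3.1, 2.9, 7.2, 6.8, 11.1, 10.9]
results = all_possible_regressions(data, y)
for subset, p, r2, cp in results:
    print(subset, p, round(r2, 3), round(cp, 2))
```

By construction, C_p for the equation containing all variables equals p, which is a quick check on any such tabulation. Forming the normal equations directly is adequate for a small example but can be numerically fragile for larger problems.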


5.4.2  Preliminary  Screening  of  Candidate  Variables 

The  screening  of  the  variables  has  not  been  done 
solely  on  the  basis  of  statistical  analysis.  Subjective 
judgments  regarding  the  questions  in  Table  5.6  have  always 
been  included  in  the  selection  process,  as  was  done  during 
the  aggregate  analysis.  Introduction  of  subjective 
judgment   into   the   forecasting   process   is   one   of  the 
Table  5.18

Stepwise  Regression  Summary 
(Disaggregate  Analysis) 


Highway
Category

ATR
Station

Case
(*)

Step

Variable
subscript
(Entered,
Removed)

F
value

Signifi-
cance
level

Last
step
b-coeff.
(**)

R
Squared

Overall
F

Overall
signifi-
cance

RURAL
INTERSTATE

172A 

Bee 

1 
2 
3 

7 
1 

13 

30.421 

76.767 

9.275 

0.C 
0.0 
0.014 

.016E 
-2.2489 

6.9492 

0.734 
0.969 
0.985 

3C.421 
158.364 
196.032 

0.0 
0.0 
0.0 

Constant 

terr, 

-13999.34 

3070A 

p  e  c 

1 
2 
3 
4 

1 

i 

2 
4 

27.861 

12.626 

6.310 

0 .  331 

0.0 

0.005 

0.033 

0.5?9 

-194.2593 

.7796 

0.717 
0.875 
0.926 

0.924 

27.861 
34.966. 
37.792 
6C.575 

0.0 
0.0 
0.0 

o.c 

Constar  t 

tern 

-6035.972 

5474A 

Bee 

l 

4 

80.320 

0.0 

1.1888 

O.880 

80.320 

CO 

Constant 

terr 

-34973.79 

RURAL

PRINCIPAL 

ARTERIAL 

68A 

Bee 

1 
2 

7 
2 

234.112 

17.697 

0.0 
0.002 

.0019 

-25.7583 

0.955 

0.984 

234.112 

305.583 

0.0 
0.0 

Constant 

tern 

747.0824 

134A 

B  6   C 

1 
2 
3 
4 
5 

6 
2 
7 

9 

6 

10.143 

12.651 

4.989 

0.948 

9.0*9 

0.009 
0.005 
0.052 
0.356 

0.015 

-87.6167 
.0049 

-.0104 

0.480 
0.770 
0.852 
0.837 
0.919 

10.143 
16.768 
17.301 
25.611 
33.874 

0.009 

o.oe: 

0.0 
0.0 
0.0 

Constant 

tern 

15267.25 

173* 

B  6  C 

1 

13 

450.373 

0.0 

3.5697 

0.975 

430.373 

0.0 

Constant 

tern 

-3992.541 

25  4B 

Bee 

1 
2 

13 
7 

25.723 

8.150 

0.0 

0.017 

8.8341 
-.0051 

0.70O 
0.835 

25.723 
25.296 

0.0 

Constant 

tern 

-9374.137 

RURAL

MINOR 
ARTERIAL 

25A 

B 

1 
2 
3 

1 
2 
3 

25.609 
12.528 

3.710 



0.0 

0.005 

0.086 

.1954 
(-1C.711) 

(-67.303) 

0.700 
0.867 
0.906 

25.609 
32.489 
28.766 

0.0 
0.0 
0.0 

Constant 

tern 

(132329.) 

c 

1 
2 

1 
2 

25.609 

12.528 

0.0 
0.005 

.1286 

-24  9935 

0.700 
0.867 

25.609 

32.489 

0.0 
0.0 

Constant 

tern 

1492.571 

279A 

B 

1 
2 
3 
4 

2 
6 
5 

2 

9.134 

12.740 

3.963 

0.009 

0.012 
0.005 
0.078 

0.928 

.0392 
-.0949 

0.454 
0.760 
0.833 

0.833 

9.133 
15.811 
14.985 
24.945 

0.012 
0.001 
0.001 
0.0 

Constant 

tern 

6534  579 

c 

1 
2 

2 
6 

9.154 

12.740 

0.012 
0.005 

-28.5409 

.0186 

0.454 

0.760 

9.133 

15.811 

0.012 
0.001 

Constant 

tern 

«92   49' 

Table     5.18    (continued) 
Highway
Category


Stat  ion 


case 


Step 


Uanable 

subscript 


En- 
tered 


Re- 
newed 


Signifi- 

sric* 
leuel 


last 

step 

b-coeff. 
(ii) 


R 
Squared 


Outran 

F 


Outran 
signifi- 
cance 


36  « 


319ft 


U2h 


188H 


256fi 


262£ 


21.652 
7.589 
6.766 
6.25'j- 


.661 
.82* 
.857 
.873 


.2561 

-ie.ie.7i 

.  1955 


.657 

.f.&5 

.917 


21.852 

26.626 
26.519 
22. 8  Hi 


Constant        tel'r 


(125786.) 


21.852 


.881 


.236: 

-17.92*' 


.657 
f.ff. 


21.852 
2f<  626. 


Cor.itarit       tern 


l.-^OtJ.i'/  ,J 


t  4  C 


6 


ui.7u: 


e.8 


.8899 


701 


61.7k' 


Constant   tern 


761.2511 


I  i  C 


1    1 


Q  or> 


.812 


i.u!:.s;.dr:!. 


tern 


E  &  C 


1 


6 


constant   tern 


E  H: 


tit 


e 


Constant   tern 


Constant   tern 


1.581 
8.681 

1.886 


6.8 


.86'1 


.26* 
.8-5 
.876 
.3S 


.8279 


211*7.176 


.  ■>*.*  «- 


t1S5.917 


.  1662 


26111.796 


-11.1*293 

(8.2365) 

(-71.635) 


(13S352.) 


671 


.128 
.527 
.673 


O  W.i 


%.*-:. 


S=:f. 


1.581 
5.578 
6.166 


(NO  UftftWBLES  REflfilH     IM  T>tE  EQuflTlOfl) 


.881 
8.8 
6.8 
8.8 


.881 


8.8 


.H"» 


8.6 


.266 

.82U 
.815 

.829 


Table     5.18     (continued) 


HigfiHi-, 

HTF 

C-oSe 

step 

Variable 

F 

i  19rilf  !- 

lift 

R 

Ooeral! 

Overall 

Categorj 

Station 

( '  .■ 

sutjcript 

value 

leuel 

stef 

ti-tOfcff . 

Squared 

F 

r  1<""M  -M  _ 

cance 

En- 

tfe- 

tered 

noued 

(,-M 

1 

1 

28.563 

6.? 

.722 

2c.  zr 

8.8 

i 

2 

J 

r     <;«;tt 

.8:7 

-128.1981 

.82U 

2:.Wi5 

8.6 

U 

j 

k 

S.U57 

.8* 

.2672 

.8?o 

28.622 

8.8 

K 

B 

D 

1 

2.882 

.128 

.cc: 

25.62k 

6.6 

R 

L 

c 

2 

5.983 

.83o 

-19.7815 

.699 

26 .  829 

8.6 

Constant       tern 

231*1.55 

59fi 

1 

1 

28.565 

fe.fc 

.1213 

.722 

2S.K--5 

8.6 

c 

2 

j 

5.818 

.837 

-IP'S    5F1'.- 

.62tt 

2".uur 

e.e 

Constant      tern 

217571.57 

n 

R 

: 

c 

2H'i:. 

E  d.  [ 

1 

6 

2&.U97 

.861 

.O.llSl? 

.7511 

2U.U97 

.681 

Constant        tern 

681tt.  19*  1 

R 

1 

6 

23.M7 

.801 

.2Ut2 

.681 

23.W7 

.881 

f 
0 

L 

5k26ri 

B  &  C 

2 

5 

11.596 

.687 

-e;.:59^ 

.853 

26.? Ik 

8.6 

Constant      'tern 

t2» 7^   " 

i 

k 

11.9«e 

.885 

-.86711 

r*v« 

11.988 

.885 

L 

2 

o 
j 

3.38( 

.699 

-16.1581 

.6ue 

8.9*; 

.88E 

E 

B 

3 

E 

18 .669 

.<•-*• 

.H623 

15.251 

.6*1 

c 

T 
0 
R 

78U7fi 

1 

x 

-  "f  1 

.82? 

•5    ">  -  f. r 

_91? 

22.9?? 

8.8 

Constant      tern 

3325U.73 

r 

1 

k 

n  j;  j 

.  885 

-.8UC-' 

r '  h 

<  <    Q :  : 

N'.r 

Constant      tern 

1122.8611 

(*)  Case A:  (All Default Parameters)

     1.  Max. no. of steps = 2 x No. of independent variables
     2.  FIN = .01;  FOUT = .005
     3.  Tolerance level = .001

     Case B:

     1.  Max. no. of steps = 2 x No. of independent variables (Default)
     2.  FIN/FOUT = F(.10, 1, n-p), where FIN > FOUT,  n = No. of cases,
         p = no. of expected parameters in equation.
     3.  Tolerance level = .01

     Case C:

     Same as Case B except FIN/FOUT = F(.05, 1, n-p)

(**) 95% confidence interval of the b-coefficient in ( ) includes zero.
Table  5.19

Selected  C_p  &  R-Squared  in  All  Possible  Regression
(Disaggregate  Analysis)


Highway 

ATR 

Subscript 

.s  of  variables 

C  p  Values  in  sane 

P-Squar 

ed  Values 

P 

Category 

Station 

in  Equation 

order 

in  sane 

order 

7. 

13.              8. 

13476,14722.14895, 

.734. 

.710. 

.706. 

2 

10. 

1.                  5 

17995.19354.22645 

.645.. 

619, 

.554 

1  7, 

5  7.           4  7, 

1547.   2407.   2597. 

.969. 

.952, 

.949. 

t 

7  9. 

5  13.          4  13 

3272.    4402,    4565, 

.935, 

.913, 

.910. 

5  8 

5937 

.883 

R 

172A 

u 

1   7  13, 

12  7,        17  8. 

760,   1314,    1335. 

.985, 

.974, 

.974, 

4 

p 

1  6  7. 

15  7.     1  7  10. 

1454,1456,      1467, 

.971,. 

971, 

.971. 

A 

1   4  7. 

17  9.        5  6  7. 

1497,    1511,    1772, 

.97C, 

.970, 

.965. 

L 

5  7  10. 

2  5  7.        4  5  9 

1896,    1932,    4686 

.963, 

.962, 

.90? 

Subscripts 

of  all  variables 

12 

3 

.000 

12 

1. 

7.           13. 

34.60,34.82,34.86, 

.717, 

.716. 

.715. 

2 

8. 

4.              5, 

37.80,44.45.48.04. 

.696, 

.653. 

.63C. 

Q 

53.71 

.593 

2  8. 

2  4.            2  7. 

4.50,    4.75,   5.59, 

.925, 

.924. 

.916. 

3 

2  5, 

12.          2  13. 

7.94,12.27,12.40. 

.903, 

.875. 

.874. 

5  7. 

2  9 

18.44.21.42 

.835, 

.816 

I 

3070A 

N 

2  6  8. 

2  6  7,     2  8  10, 

0.18.   4.00,    4.04, 

.966. 

.942. 

.941. 

4 

T 

5  7  10. 

2  4  8.     2  7  10. 

4.20,    4.21,    4.25, 

.940, 

.940. 

.940, 

E 
R 
S 

1  2  7. 

2  4  10.       2  5  8 

5.31.   5.66,   5.74 

.933, 

.931, 

.93 

T 
A 
T 
E 

Subscripts 

of  all  variables 

12 

.994 

12 

4. 

1.               7. 

72.0.220.2,265.1, 

.880, 

.659. 

.593, 

2 

11. 

5.                 8 

290.2.295.9,300.1 

.555. 

.547. 

.54 

1  9. 

2  4.           12, 

42.7.  60.7,  67.0. 

.926, 

.899. 

.890, 

3 

1  4. 

4  5.         4  13 

73.2.   73.3.   73.5 

.881, 

.881. 

.88 

5474A 

12  9. 

12  4.     7  9  10. 

3.32.25.07.31.29. 

.988. 

.955. 

.946. 

4 

1  9  13. 

15  9.       14  9. 

32.88.32.98.33.36 

.944. 

.944. 

.943 

| 

Subscripts 

of  all  variables 

12 

.999 

12 

Table     5.19     (continued) 




Highway 

ftTR 

Subscripts  of  Variables 

C  p    values  in  sane 

P-SquareO  Values 

F 

Category 

Static 

in  Equation 

order 

in  sa^e 

erdt: 

7.              15,                8. 

55.0,    59.4,    73.6, 

.955. 

.952. 

.942. 

2 

1.                9.                5. 

148.0.180.5.260.2. 

.890. 

.867, 

.811. 

4.              10.                  6 

264.1,344.3,345.1 

.809. 

.752. 

.752 

2  7.           2  8.         5  13. 

16.12.20.04.31.20. 

.984. 

.981. 

.973. 

3 

R 

68A 

4  13.            5   7.            4  7. 

31.43.35.47.56.50. 

.973. 

.970. 

.970. 

U 
R 

ft 

7  9.            18 

37.61.75.45 

.969. 

.942 

2  7  8,        12   7,        2  5  7, 

13.04,15.23,15.98, 

.987, 

.986. 

.985. 

4 

L 

2  4  7.       2  7  9.       12  8 

16.37,16.59,20.56 

.985, 

.985. 

.982 

Subscripts  of  all  variables 

12 

.995 

12 

2  7.        5  13.          1  2. 

36.57.37.52.38.14. 

.837, 

.833. 

.831. 

3 

2  8,         9  13,         2  13, 

41.01.50.49.51.59. 

.820, 

.784, 

.78C. 

5  6,           18 

74.88.141.9 

.693, 

.442 

P 

154A 

P 

2   7   9,        2  5  7,        2  5  8. 

16.70.17.36.17.56. 

.919. 

.916. 

.915. 

4 

I 
N 

45  9,        28  9.          589 

24.08.26.30,26.75 

.891. 

.883. 

.8fl 

C 
I 
P 
ft 

Subscripts  of  all  variables 

12 

.99t 

22 

13.                7,                9. 

27.60.46.64.66.11. 

.975. 

.962. 

.949. 

•y 

L 

8.               1.                 5 

68.67.74.94,266.1 

.947. 

.943. 

.813 

2  9.            12.          9  13, 

12.32,75.34,27.75, 

.987. 

.976. 

.976, 

T 

173ft 

8  13.         4  13.         1  13. 

27.81.28.46.29.14. 

.976. 

.976. 

.975. 

5  13,            16 

29.53.67.54 

.975. 

.949 

2  9  10,        2   7  9,        12  9, 

2.38.    4.08.    5.83, 

.995. 

.994, 

.993. 

4 

ft 

2  9  13,        2  5  9,        2  4  9 

5.89,   6.28,8.07 

.993. 

.992, 

.991 

R 

T 
E 

R 

Subscripts  of  all  variables 

12 

.999 

12 

9,                7.                8. 

16.04,18.26,18.40, 

.652. 

.621. 

.619. 

2 

I 
A 

L 

1.                5.                  4 

19.27.19.90,25.77 

.607. 

.598. 

.517 

7  13.          1   13.          8  13. 

4.88,   5.31.    7.92, 

.835. 

.829, 

.793, 

3 

25  4E 

4  13.           2  9.           4  9. 

8.63,10.52,11.44. 

.783. 

.757. 

.744, 

5  13.           18 

12.79,20.39 

.725. 

.619 

2  5  9.     4  7  13.       2  4  9. 

4.64.   4.67.   4.76. 

.866. 

.866. 

.864. 

4 

1   4  13.     1   8  13.     12  13 

4.98.    5.54.    5.62 

.861. 

.854. 

.852 

Subscripts  of  all  variables 

12 

.986 

12 

Table     5.19     (continued) 
Highway

ftTR 

Subscripts  of  uariabies 

i     Values  in  sane 
p 

order 

R-Squared  Ualues 

P 

Category 

Station 

in  Equation 

in  sane 

order 

1.              U.              6. 

21t.56.S1.H3.S2.52. 

.798. 

.638.   .629. 

2 

5 

1*1.55 

.5H6 

1  2,          2  U,          1  5, 

7.91,  8.1b    8.83. 

.867, 

.8611,    .858, 

3 

2  5,          U  5,          1  U 

28.63,21.113,23.62 

.750. 

.7U6,    .726 

25ft 

12  5,      2  it  5        1  K  5.. 

7.11,  7.3*    8.72, 

.892, 

.896,   .877, 

U 

R 

u 
P 

1  2  U,       12  6,       2  It  6 

8.98,  9.7U,  9.77 
7 

.875, 

.86.8,   .868, 
.9U6 

7 

Subscript?  of  all  uari%t'lef 

5  6,          1  5,          It  6, 

8.33,   1.79,  2.87, 

.833, 

.866. .793, 

3 

L 

1  6,          2  6,          1  2, 

2.82..  3.5tt,  6.52, 

.776. 

.766,-692. 

2?9fl 

2  U,          2  5 

7.17,18.3U 

.677, 

.695 

«  5  6,      2  5  6,       1  2  5, 

2.19,  2.32,  2.55, 

.836, 

.833,   .828, 

Li 

14  5.       1  U  6.       12  6 

3.16.  3.73,  It. 27 
7 

.81*. 

.891.  .769 
.863 

7 

Subscripts  of  all  variables 

U,              6..              1, 

28.  16,21. 18,23.  »1, 

.657, 

.6W,     .619 

2 

5 

25.83 

.59 

n 

i 

1  2,          2  6,          2  U, 

9.81,  9.51,  9.5?, 

.812, 

.896,   .865, 

t 

H 

2  5.          15,          1  It 

13.119,16.39,21.86 

.759. 

.725.  .66 

0 

361ft 

ft 

2  k  6,      12  6,      2  5  6, 

5. Sit,  6.17,  8.91, 

.872, 

.6*15,   .636, 

It 

It  5  6,       12  5..       1  2  ll 

9.58,18.12,18.18 

7 

.828 , 

.823,  .821 
.929 

7 

Subscripts  of  all  variables 

R 

6,             1.             U, 

5.83,  9.93,1U.11 

.791. 

.7311.  .675, 

2 

R 

5 

17.17 

.632 

T 

E 

1  5,          2  6,          1  2, 

It. 65,  5.61,  6.13, 

.8IUt, 

.823,  .815, 

3 

R 

5  6,          t  6,          1  6, 

6.56,  7.23,  7.62, 

.899, 

.896,   .79U, 

I 

1  U,          2  U 

11.88,11.97 

.7H6, 

.733 

ft 

319ft 

L 

12  5,       1  It  5,       15  6, 

3.U8,  ».87,  5.92, 

.881, 

.86.1,   .6*6, 

» 

12  6.      2  5  6,      1  2  U 

7.88.  7.55,  7.8it 
7 

.838. 

.82lt,  .819 

.916 

t 

Subscripts  of  all  variables 

Table     5.19     (continued) 


Highway 

Category 

ATT' 

station 

Sut'sotipt;  of  Uanables 
in  Equation 

Co   Values  in  sane 
orae»' 

R-Squarea  Ua lues 
iri  sane  order 

R 

R 
U 

R 
A 

L 

n 
I 

N 

0 

R 

A 
R 
T 

E 
R 
I 
A 

L 

BURAL 

nwoc 

COLLEC- 
TOR- 

k2fl 

1.           k,            5 

1  6,          k  6.           1  2, 

1  k,          15,          2  k 

12.73,13.87,1k. 12 

12.86,12.k  1,12.95, 
*.55, 1k. 71,15.8k 

7 

.525,.»9S,     .k92 

.5*.   .5*1,    .569 
.536,  .526,   .518 

.»li 

2 
3 

Sut'scripts  of  all  variables 

te*.- 

6.             1,              5 

k  6,           1  6           2  6, 
2  5,          k  5,          12 

6.91, 13.2k, 15.9 

7.3*.   8.51.   8.73, 

16.86, 11. k«, 12.61 

7 

.786,.69fc,     .656 

.8*9,   .792.    .769. 
.776,  .756,   .731 

.928 

3 

>L:;:rip*.i  of  ail  uariables 

256ft 

6,             1,               5 

2  6,          12           2  5, 
16,          k  6            1k 

k.26,5.61,  6.6.6 

U.66     k.M    5.k* 
6.15,  6.17,6.51 

7 

.671,.632,     .611 

.726.    .716,   .692, 
.673,  .673,  .66k 

.851 

2 

3     ■ 

Sut'script;  of  all  uanaCle? 

262ft 

2  a,         12,         2  6 

2  *  5,       ill  6.       1  2  6. 
12  k,       12  5,      2  5  6 

t;.i;  13  m  «.39 

■».?('  ft.k8,H.99, 

■6.66, 15.  U9, 15.65 

.527,   .515,   .k97 

.629,    .SHU.    .538 
.529,  .516.   .515 

.859 

3 
k 

7 

Subscripts  of  all  uariauie; 

78U7R 

1 

5  6,          11*           »5, 
2  5,           15           2  it 

2  5  6.       1  k  6.      15  6. 
15  6.       1  k  5.       12k 

11.79 

k.67,  6.25,  7.85, 
1*. 89, 16. 23, 16. 27, 

1.36.  3.26.  3.k2. 
k.3f>,  6.93,  7.35 

.55k 

.81*,   .778,   .737, 
.593,  .5*6,   .565 

.911.  .672.   .869. 
.851.   .797,   .766 

.91? 

2 
3 

k 
7 

Su^crirts  o*  all  variables 

Table     5.19     (continued) 
Highway
Category 

ATT: 

Station 

su&scripts  of  uariables 

in  Equation 

C     ualues  in  sane 
ord*r 

R-Squared  uaiues 
iri  sane  order 

P 

R 
U 
R 
R 
L 

n 

& 

0 
R 

C 

c 

L 
L 
E 

C 
T 
0 
R 

59h 

1,            2,             k 

k  5.          ill,          1  2, 
15,          2  5,          111 

2  it  5,       1  K  5,      US  6, 
1  2  it,      2  It  6,       12  6 

21. 16,22. US,  29.98 

7.98  ft.  1i,19.76, 
26.91,21.36,23.85 

5.28,  6.38,  6.83, 
Hi.  61, 16. 16, 21. 7 

7 

.722,  .716.   .6H 

.86-?.    .885,    .753, 
.7U3,  .739,   .723 

QFtr,       5Q6       566 

.817..866,     .75U 
.9U5 

3 
It 

7 

SuPscripts  of  all  uariables 

286K 

6,            1,             it 

k5,          15            16, 
k  6,          12..          2  it 

k  5  t,       1  it  5.       2  it  S, 
12  5,       15  6,       1  it  6 

33.85,55.93,69.13, 

13.95 ,27 .69,33.81. 
3l4.U6,55.93,6li.99 

18.93,12.78,111.38, 

23. 16, 25. 33, 35. 6H 

7 

.75U,   .635,   .55? 

.89lt,   .81k,    .776. 
.77U,  .647,   .59i| 

$i-lj        Q"P       Qftli 

.852,-839,     .779 
.932 

2 

Ii 

7 

Subscripts  of  all  uariables 

5U28fi 

6 

5  6,          16,          2  6, 
1  6 

1*6,      It  5  6,       12  6, 
2  5  6,      15  6,      2  ft  6 

6.2U 

1.35,   1.67,  it.52, 
8.15 

1.13,  2.32,  2.82, 
2.67,3.12,    6.26 

7 

.681 

.825,  .818,   .756, 
.652 

.872,  .8U7,   .636, 
.8:35,-6361,     .76U 

.87lt 

3 

It 

7 

Subscripts  of  all  uariables 
suggestions  made  by  Armstrong  in  his  critique  of  common
practice [4, 5]. The first four goals in Table 5.7 were
considered in the preliminary choices of candidates for
the final selections. As a general rule [15], the third
goal (i.e., number of X-variables in the model) does not
support more than 2 variables in the equation (see
footnote of Table 5.7). In these preliminary choices, the
third goal was relaxed for some stations in order to
satisfy other goals in Table 5.7. Diagnoses similar to
those done for the aggregate analysis in sections 5.3.2.1
to 5.3.2.4 were carried out for the stations under
investigation. The results of the preliminary screening
process are shown in Table 5.20. The statistical results
on R^2 value, C_p-criterion, stepwise regression, and
correlation coefficient were not considered alone in making
the preliminary choices. Subjective judgments regarding the
questions in Table 5.6 were also involved in these
preliminary choices.

5.4.3  Final  Selection  of  Variables


The final model selection for each station was made
from the preliminary choices of Table 5.20 by examining
the goals of the analysis in Table 5.7. Goals 1 to 4
in Table 5.7 were considered during the preliminary
screening process. The signs of regression coefficients


Table  5.20
Preliminary  Choices  of  Disaggregate  Analysis

Highway      ATR       Choice                          Subscripts of Variables in same
Category     Station   Numbers                         order as Choice Numbers

RURAL        172A      1, 2, 3, 4, 5                   7, 8, 1 7, 4 7, 1 7 13
INTERSTATE   3070A     1, 2, 3, 4, 5,                  1, 4, 7, 8, 2 4, 2 7,
                       6, 7, 8, 9, 10                  2 8, 2 6 7, 2 6 8, 2 8 10
             5474A     1, 2, 3, 4                      4, 1 9, 2 4, 1 2 9

RURAL        68A       1, 2, 3                         7, 2 7, 2 8
PRINCIPAL    134A      1, 2, 3, 4                      1 2, 2 7, 2 8, 2 7 9
ARTERIAL     173A      1, 2, 3, 4, 5                   7, 8, 1 2, 4 13, 2 7 9
             254B      1, 2, 3, 4, 5, 6                1, 4, 7, 8, 7 13, 8 13

RURAL        25A       1, 2, 3, 4                      1, 4, 1 2, 2 4
MINOR        279A      1, 2                            1 6, 2 6
ARTERIAL     301A      1, 2, 3, 4                      1, 4, 1 2, 2 4
             319A      1, 2, 3, 4, 5                   1, 4, 6, 1 2, 2 4
             42A       1, 2, 3, 4                      1, 4, 1 2, 2 4
             100X      1, 2, 3, 4, 5                   1, 6, 1 2, 2 6, 4 6
             256A      1, 2, 3, 4                      1, 6, 1 2, 2 6
             262A      1, 2                            1 2, 2 4

RURAL        59A       1, 2, 3, 4, 5                   1, 4, 1 2, 1 4, 2 4
MAJOR        200X      1, 2, 3, 4, 5                   1, 4, 6, 1 2, 2 4
COLLECTOR    5420A     1, 2, 3                         6, 1 6, 2 6
             7047A     1, 2                            4, 1 4


were checked through the regression on preliminary choices
of Table 5.20. Final selection for each station was
determined by examining all the criteria of Table 5.7
except the residual analysis and hypothesis testing
concerning b-coefficients. Graphic residual analysis and
tests concerning regression coefficients were carried out
on the final selection before transforming it into a model
according to equation 3.1 of Chapter 3. Residual plots were
done to check whether some key independent or predictor
variables could provide additional predictive power to the
models developed. The tests concerning regression
coefficients were done to confirm that the variables in
the models are statistically significant.
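
One common form of such a coefficient test checks whether the 95 percent confidence interval of a b-coefficient includes zero. The sketch below does this for a one-variable regression. It is an illustration only: the function names and data are made up, and the critical t value must be supplied from a table (3.182 is the standard two-sided 95 percent value for 3 degrees of freedom).

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Confidence interval for b in the fit y = a + b * x.

    t_crit is the two-sided critical t value for n - 2 degrees of
    freedom, looked up by the caller.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    return b - t_crit * se_b, b + t_crit * se_b


def includes_zero(interval):
    low, high = interval
    return low <= 0.0 <= high


T_95_DF3 = 3.182  # standard t-table value, two-sided 95%, 3 d.f.

x = [1, 2, 3, 4, 5]
strong = slope_confidence_interval(x, [2.1, 3.9, 6.2, 7.8, 10.1], T_95_DF3)
weak = slope_confidence_interval(x, [5, 3, 6, 4, 5], T_95_DF3)
print(includes_zero(strong), includes_zero(weak))
```

A variable whose interval includes zero, like the second series here, would be a weaker candidate for the final selection.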


5.4.3.1  Regression  on  Preliminary  Choices  and 

Final  Selection 

Regression on the preliminary choices was carried out
with the help of the SPSS package [39]. The summary of
that analysis is shown in Table 5.21 for all the stations
under analysis. The magnitudes of the b-coefficients and
their inconsistencies with respect to sign are shown in
Table 5.21, together with R^2, overall F value, and
significance of the choices. Table 5.21 also shows the
variables for which the 95 percent confidence interval of
the b-coefficients includes zero. In finding the final
selection, choices in which the 95 percent confidence
interval does not include zero were preferred. The
diagnoses carried out on the choices for the stations in
Table 5.21


Table  5.21

Multiple  Linear  Regression  Summary
on  Preliminary  Choices
(Disaggregate  Analysis)

Highway      ATR      Choice  Variable   b-coefficient in same order    Inconsis-   R        Overall   Overall
Category     Station  Number  Subscript                                 tencies in  Squared  F         Signifi-
                              (*)                                       b's (**)                       cance

RURAL        172A      1      7          0.0038                         ---         0.734     30.421   0.0
INTERSTATE             2      8          0.0171                         ---         0.706     26.476   0.0
                       3      1,7        -2.0297, 0.0201                -b1         0.969    158.365   0.0
                       4      4,7        -1.9864, 0.0099                -b4         0.949     92.514   0.0
                       5      1,7,13     -2.7489, 0.0166, 6.9492        -b1         0.985    196.032   0.0

             3070A     1      1          0.2352                         ---         0.717     27.861   0.0
                       2      4          0.4000                         ---         0.653     20.702   0.0
                       3      7          0.0030                         ---         0.716     27.668   0.0
                       4      8          0.0137                         ---         0.696     25.204   0.0
                       5      2,4        -194.259, 0.7796               ---         0.924     60.575   0.0
                       6      2,7        -150.500, 0.0050               ---         0.918     56.154   0.0
                       7      2,8        -166.311, 0.0241               ---         0.925     61.951   0.0
                       8      2,(6),7    -142.315, -0.5107, 0.0068      -b6         0.942     48.357   0.0
                       9      2,6,8      -164.021, -0.7264, 0.0364      -b6         0.966     82.232   0.0
                      10      2,8,(10)   -173.163, 0.0278, -0.0032      -b10        0.941     48.123   0.0

             5474A     1      4          1.1889                         ---         0.880     80.320   0.0
                       2      1,9        0.6558, -0.0108                -b9         0.926     62.672   0.0
                       3      (2),4      -14.2176, 1.2842               ---         0.899     44.650   0.0
                       4      1,2,9      0.6086, -40.5611, -0.0076      -b9         0.988    239.613   0.0

RURAL        68A       1      7          0.0016                         ---         0.955    234.112   0.0
PRINCIPAL              2      2,7        -25.7583, 0.0019               ---         0.984    303.583   0.0
ARTERIAL               3      2,8        -31.1802, 0.0092               ---         0.981    258.892   0.0

             134A      1      1,2        0.0443, -109.080               ---         0.831     24.547   0.0
                       2      2,7        -116.511, 0.0026               ---         0.837     25.611   0.0
                       3      2,8        -122.754, 0.0121               ---         0.820     22.778   0.0
                       4      2,7,9      -87.6167, 0.0049, -0.0104      -b9         0.919     33.874   0.0

             173A      1      7          0.0026                         ---         0.962    279.343   0.0
                       2      8          0.0119                         ---         0.947    196.993   0.0
                       3      1,2        0.4903, -52.2181               ---         0.978    222.058   0.0
                       4      (4),13     -0.0592, 3.5844                -b4         0.976    201.959   0.0
                       5      2,7,9      -47.5616, 0.0011, 0.0080       ---         0.994    482.024   0.0

             254B      1      1          0.1495                         ---         0.607     17.013   0.0
                       2      4          0.3870                         ---         0.516     11.771   0.006
                       3      7          0.0013                         ---         0.621     18.045   0.001
                       4      8          0.0059                         ---         0.619     17.901   0.001
                       5      7,13       -0.0051, 8.8341                -b7         0.835     25.296   0.0
                       6      (8),13     -0.0166, 6.7564                -b8         0.793     19.127   0.0

Table     5.21     (continued) 
Highway

RTP 

Choice 

Variable 

b-coefficierit  in  sane  oroer 

inconsis- 

P 

Overall 

Overall 

Category 

station 

Number 

Subscript 
(*) 

tencies  in 
b' s  (*+) 

Squared 

F 

Signifi- 
cance 

1 

1 

.0824 

.700 

25.609 

0.0 

p 

25R 

L 

4 

.1695 

— 

.638 

19.407 

0.0 

U 

V 

1.2 

0.1286.   -24.9935 

— 

.867 

32.48  5 

CO 

Ft 

ft 

L 

4 

2    4 

-32.2458,       .3067 

— 

.864 

31.858 

0.0 

279H 

1 

1, 6 

-0.0337,         .0521 

-bl 

.776 

17.331 

.001 

L 

2   6 

-28.540?        .0186 

— 

.760 

15.811 

.001 

1 

1 

.080  6 



.619 

17.838 

.001 

301R 

*> 

c 

4 

.1505 

— 

.657 

21.052 

.001 

3 

1.2 

0.1423.   -22.2131 

— 

.612 

21.530 

0.0 

4 

2,  4 

-17.9201,       .2363 

— 

.805 

20.626 

0.0 

1 

1 

.0471 



.734 

30.311 

0.0 

2 

4 

.1688 

— 

.675 

22.844 

.001 

M 

319fl 

3 

6 

.0900 

— 

.791 

41.743 

0.0 

I 

4 

1,(2) 

0.0695     -16.3968 

— 

.815 

22.069 

0.0 

N 

5 

(2), 4 

-13.9513,      .2405 

— 

.733 

13.740 

.001 

0 

R 

1 

1 

.0279 

— 

.525 

9.965 

.012 

42P 

z 

4 

.0533 

— 

.495 

e.9'-: 

.015 

3 

1,(2) 

0.0381,    -12.348"1 

— 

.569 

S.276 

.035 

4 

(.2), 4 

-7.9152,         .0656 

— 

.516 

4.299 

.054 

1 

1 

.0168 



.696 

22.876 

.001 

9 

I 

6 

.0242 

— 

.7?e 

36.837 

0.0 

1 0  OX 

3 

1.(2) 

0.0205.    -17.2954 

— 

.731 

12.202 

.003 

4 

(2^    6 

-4.0496.         .0251 

— 

.789 

16.833 

.001 

ft 
R 

T 

5 

(4).  6 

-0.0081,        .0209 

-D4 

.609 

19.061 

.001 

1 

1 

.0819 



.632 

16.699 

.001 

E 

256ft 

2 

6 

.1642 



.671 

22.386 

.001 

R 

3 

1.(2) 

0.1233,      -6.4128 



.716 

12.612 

.002 

I 
R 
L 

4 

(2). 6 

-6.1982,         .2203 



.726 

13.232 

.002 

262fl 

1 

1.  2 

0.0356,    -12.9828 



.515 

5.309 

.027 

I 

:.  4 

-13.7883,      .0872 



.52( 

5.570 

.024 
Table 5.21 (continued)


Highway 

RTR 

Choice 

Variable 

D-coefficient  in  sarce  order 

Inconsis- 

R 

Overall 

Overall 

Category 

Station 

N  un  D  e  r 

Subscript 
(*) 

tencies  in 
b's  (**) 

Squared 

F 

Signifi- 
cant e 

R 

1 

i 

.0481 

722 

28.56? 

CO 

U 

z 

4 

.0850 

— 

.709 

26.906 

0.0 

R 

59R 

3 

1,(2) 

0.0596, 

-11.4951 

— 

15.267 

.001 

R 

4 

(1).(4) 

U.C379. 

.0165 

— 

.723 

13.047 

.002 

L 
M 

s 

('2>,  4 

-33.4952 . 

.1309 

— 

.805 

20.670 

0.0 

R 

1 

« 

.Of?? 

— 

.635 

13.940 

CO 

J 

L 

4 

.1080 

— 

.558 

10.101 

.013 

0 

Z  0  OX 

3 

e 

.0815 

— 

.754 

24.497 

.001 

R 

4 

1,(2) 

0.0571.. 

-5.9596 

— 

.64^ 

6.436 

.026 

c 

0 

5 

(Z).  4 

-11.8313. 

.1391 

— 

.594 

5.123 

.043 

L 

1 

6 

.1163 

— 

.681 

£3.447 

.001 

L 

5  41- OR 

Z 

1,* 

-0.0917. 

.2509 

-M 

.oio 

22.501 

0.0 

E 

c 

T 

3 

(2), 6 

-12.8709, 

.1559 



.756    , 

15.700 

.001 

7047R 

1 

4 

-.0433 



.531 

11.966 

.005 

0 

2 

(D.4 

-0.0091. 

-.0590 

-Dl 

.608 

7.749 

.CC9 

R 

(*)  The 95% confidence interval of the b-coefficient(s) for the variable(s) enclosed in parentheses includes zero.
(**) The '+' or '-' sign of the b-coefficient(s) is inconsistent with the expected result.
to find the final selection for each station -- were similar to those done earlier for the aggregate analysis in Section 5.3.3.1. The final selections from this disaggregate analysis are shown in Table 5.22. In general, no more than two X-variables were retained in the final selection.

5.4.3.2  Graphic  Residual  Analysis  on  Final  Selections 

The stations 3070A, 68A, 301A and 7047A, one from each of the four highway categories, were picked through random sampling. The residual plots of these representative stations, shown in Appendix E, were generated by the BMDP package [47]. The plots of the other stations were found to be similar to those of the representative stations. The residual plots against the predicted AADT, the final selected predictor variable(s), and the "year" are presented in Figures E1.1 to E1.4, E2.1.1 to E2.4.1, and E4.1 to E4.4, respectively, in Appendix E.


The normal probability plots of residuals are given in Figures E3.1 to E3.4 in Appendix E. These plots appear reasonably close to straight lines and indicate that the error terms are approximately normally distributed. The random pattern in the plots of residuals against the fitted response variable and the predictor variables (Figures E1.1 to E2.4.1) gives no ground for suspecting the appropriateness of


Table 5.22

Final Selection of Disaggregate Analysis

Highway Category           ATR       Variable    b-coefficient(s)     Inconsis-    R         Overall    Overall
                           Station   Subscript   in same order        tencies in   Squared   F          Signifi-
                                                                      b's (*)                           cance

Rural Interstate           172A      8           .0171                --           .706      26.478     0.0
                           3070A     2, 8        -166.311, .0241      --           .925      61.951     0.0
                           5474A     4           1.1889               --           .880      80.320     0.0

Rural Principal Arterial   68A       7           .0016                --           .955      234.112    0.0
                           134A      2, 7        -116.511, .0026      --           .837      25.611     0.0
                           173A      1, 2        0.4903, -52.2181     --           .978      222.058    0.0
                           254B      1           .1495                --           .607      17.013     0.0

Rural Minor Arterial       25A       1, 2        0.1286, -24.9935     --           .867      32.489     0.0
                           279A      2, 6        -28.5409, .0186      --           .760      15.811     .001
                           301A      1, 2        0.1423, -22.2131     --           .812      21.530     0.0
                           319A      1           .0471                --           .734      30.311     0.0
                           42A       1           .0279                --           .525      9.965      .012
                           100X      1           .0168                --           .696      22.876     .001
                           256A      1           .0819                --           .632      18.899     .001
                           262A      1, 2        0.0356, -12.9828     --           .515      5.309      .027

Rural Major Collector      59A       1           .0481                --           .722      28.563     0.0
                           200X      1           .0503                --           .635      13.940     0.0
                           5420A     6           .1163                --           .681      23.447     .001
                           7047A     4           -.0433               --           .521      11.988     .005

(*) The '+' or '-' sign of the b-coefficient(s) is inconsistent with the expected result.
the linearity of the regression function or the constancy of the error variance.

Residual plots were also generated against variables not included in the model, to check whether any key predictor variable had been excluded. One such variable is Year, which was not included in any model. The plots of residuals against Year, shown in Figures E4.1 to E4.4, do not indicate any correlation of the error terms over time, since the residuals are scattered randomly around the zero line.


5.4.3.3 Testing Hypotheses Concerning Regression Coefficients

The same overall F-test and partial F-test used for the aggregate analysis of each highway category have been applied to each station separately. The results of these two F-tests are shown in Tables 5.23 and 5.24. For a model with a single variable, the partial F-test is identical to the overall F-test. The overall F-test results of Table 5.23, at α-levels of 0.05 and 0.10, show that a regression relationship between the predictor variable(s) and the response variable exists and cannot be rejected at an α-level as low as 0.05. The partial F-test results for those stations with more than one variable in the regression equation are shown in Table 5.24 at α-levels of 0.05 and 0.10. The results show that the variable(s) in the reduced models have significant influence (i.e., cannot


Table 5.23

Overall F-tests for Disaggregate Analysis

Highway Category           ATR       Variable         df_R, df_E   F*         α            Is H_a true for
                           Station   Subscripts for   (*)                                  α = .05?   α = .10?
                                     Full Model

Rural Interstate           172A      8                1, 11        26.478     <.001        Yes        Yes
                           3070A     2, 8             2, 10        61.951     <.001        Yes        Yes
                           5474A     4                1, 11        80.320     <.001        Yes        Yes

Rural Principal Arterial   68A       7                1, 11        234.112    <.001        Yes        Yes
                           134A      2, 7             2, 10        25.611     <.001        Yes        Yes
                           173A      1, 2             2, 10        222.058    <.001        Yes        Yes
                           254B      1                1, 11        17.013     .001-.005    Yes        Yes

Rural Minor Arterial       25A       1, 2             2, 10        32.489     <.001        Yes        Yes
                           279A      2, 6             2, 10        15.811     <.001        Yes        Yes
                           301A      1, 2             2, 10        21.530     <.001        Yes        Yes
                           319A      1                1, 11        30.311     <.001        Yes        Yes
                           42A       1                1, 9         9.965      .01-.025     Yes        Yes
                           100X      1                1, 10        22.876     <.001        Yes        Yes
                           256A      1                1, 11        18.899     .001-.005    Yes        Yes
                           262A      1, 2             2, 10        5.309      .025-.05     Yes        Yes

Rural Major Collector      59A       1                1, 11        28.563     <.001        Yes        Yes
                           200X      1                1, 8         13.940     .005-.01     Yes        Yes
                           5420A     6                1, 11        23.447     <.001        Yes        Yes
                           7047A     4                1, 11        11.988     .001-.005    Yes        Yes

(*) df_R = degrees of freedom for Regression,
    df_E = degrees of freedom for Error.
Table 5.24

Partial F-tests for Disaggregate Analysis

Highway Category           ATR       Variable Subscripts for    df_R, df_F   F*         α            Is H_a true for
                           Station   Full Model  Reduced Model                                       α = .05?   α = .10?

Rural Interstate           3070A     2, 8        2              11, 10       104.921    <.001        Yes        Yes
                                                 8              11, 10       30.68?     <.001        Yes        Yes

Rural Principal Arterial   134A      2, 7        2              11, 10       51.220     <.001        Yes        Yes
                                                 7              11, 10       30.892     <.001        Yes        Yes
                           173A      1, 2        1              11, 10       15.955     .001-.005    Yes        Yes
                                                 2              11, 10       221.394    <.001        Yes        Yes

Rural Minor Arterial       25A       1, 2        1              11, 10       12.528     .005-.01     Yes        Yes
                                                 2              11, 10       55.149     <.001        Yes        Yes
                           279A      2, 6        2              11, 10       12.740     .005-.01     Yes        Yes
                                                 6              11, 10       26.579     <.001        Yes        Yes
                           301A      1, 2        1              11, 10       10.239     .005-.01     Yes        Yes
                                                 2              11, 10       35.547     <.001        Yes        Yes
                           262A      1, 2        1              11, 10       10.404     .005-.01     Yes        Yes
                                                 2              11, 10       8.14?      .01-.025     Yes        Yes

df_R = degrees of freedom for SSE of Reduced model.
df_F = degrees of freedom for SSE of Full model.
A '?' marks a digit that is illegible in the source copy.
be rejected) at the 5 percent level of significance.
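As a plausibility check on the tabulated F values, the overall F statistic of a fitted regression can be recovered from its R² and degrees of freedom through the standard identity F = (R²/df_R) / ((1 - R²)/df_E). The sketch below is illustrative only: the function names are hypothetical, and the inputs are the rounded R² values reported for Stations 172A and 3070A, so the results agree with the reported F values only approximately.

```python
def overall_f(r_squared: float, df_regression: int, df_error: int) -> float:
    """Overall F statistic implied by R^2 and the two degrees of freedom."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return (r_squared / df_regression) / ((1.0 - r_squared) / df_error)


def r_squared_from_f(f_value: float, df_regression: int, df_error: int) -> float:
    """Inverse of overall_f: the R^2 implied by a reported F statistic."""
    return df_regression * f_value / (df_regression * f_value + df_error)


if __name__ == "__main__":
    # Station 172A: R^2 = 0.706 with (1, 11) degrees of freedom.
    print(round(overall_f(0.706, 1, 11), 2))
    # Station 3070A: R^2 = 0.925 with (2, 10) degrees of freedom.
    print(round(overall_f(0.925, 2, 10), 2))
    # Going the other way from the reported F of Station 172A.
    print(round(r_squared_from_f(26.478, 1, 11), 3))
```

With R² rounded to three decimals, the recomputed values (about 26.4 and 61.7) fall close to the reported 26.478 and 61.951.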


5.4.4 Model Development and Performance


The final regression equations are presented in Table 5.25, along with R² values, overall F values, t-statistics, and elasticities. The equations for the stations on rural interstates, rural principal arterials, rural minor arterials and rural major collectors explain 70.6 - 92.5, 60.7 - 95.5, 51.5 - 86.7 and 52.1 - 72.2 percent of the variation in AADT, respectively, through the associated X-variable(s). Not all of the goals of Table 5.7 have been met in all of the equations in Table 5.25. However, the equations that resulted from the goals specified in Table 5.7 are the best possible, considering all the limitations. Using the elasticities obtained from the regression analysis, a forecasting model was developed for each station by substituting those elasticities into equation 3.1 (Chapter 3). These models are presented in Table 5.26. Each of the models is simple, with not more than two variables in any case, and its use is straightforward. The data needed to predict rural traffic volumes with these models are readily available at the county, state and national levels. The models can be implemented with a hand-held calculator.
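To make the "hand-held calculator" claim concrete, the following sketch evaluates a model of the form listed in Table 5.26, AADT_f = AADT_p [1 + Σ e_i ΔX_i], where ΔX_i is the fractional change of predictor i from its present value. The function name and all input values (base-year AADT, gas price index, population) are hypothetical and chosen only for illustration; the two elasticities are those reported for Station 3070A.

```python
from typing import Mapping


def forecast_aadt(
    aadt_present: float,
    elasticities: Mapping[str, float],
    present: Mapping[str, float],
    future: Mapping[str, float],
) -> float:
    """Future-year AADT from base-year AADT and fractional predictor changes."""
    adjustment = 1.0
    for name, elasticity in elasticities.items():
        # Fractional change of the predictor relative to its present value.
        delta = (future[name] - present[name]) / present[name]
        adjustment += elasticity * delta
    return aadt_present * adjustment


if __name__ == "__main__":
    # Elasticities reported for Station 3070A.
    station_3070a = {"us_gas_price": -0.44503, "state_population": 7.74428}
    # Hypothetical inputs: gas price up 10 percent, population up 1 percent.
    present = {"us_gas_price": 100.0, "state_population": 5_000_000.0}
    future = {"us_gas_price": 110.0, "state_population": 5_050_000.0}
    print(round(forecast_aadt(18000.0, station_3070a, present, future)))
```

Because the elasticity for state population is large, even a 1 percent population change moves the forecast by several percent, which is why the accuracy of the predictor forecasts matters so much.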
Table 5.25
Final Regression Equations from Disaggregate Analysis (*)


Rural Interstate

Station 172A:
    AADT = -74246.66 + 0.0171 (State Population)
    R² = 0.706    F = 26.478
    State Population:               t = 5.146     e = 5.24231

Station 3070A:
    AADT = -105260.90 - 166.311 (US Gas Price) + 0.0241 (State Population)
    R² = 0.925    F = 61.951
    US Gas Price:                   t = -5.539    e = -0.44503
    State Population:               t = 10.243    e = 7.74428

Station 5474A:
    AADT = -34973.79 + 1.1889 (County Population)
    R² = 0.880    F = 80.320
    County Population:              t = 8.962     e = 6.18172

Rural Principal Arterial

Station 68A:
    AADT = 924.99 + 0.0016 (State Vehicle Registrations)
    R² = 0.955    F = 234.112
    State Vehicle Registrations:    t = 15.301    e = 0.86979

Station 134A:
    AADT = 7120.83 - 116.511 (US Gas Price) + 0.0026 (State Vehicle Registrations)
    R² = 0.837    F = 25.611
    US Gas Price:                   t = -5.558    e = -0.43949
    State Vehicle Registrations:    t = 7.157     e = 0.83878

Station 173A:
    AADT = -2870.28 + 0.4903 (County Vehicle Registrations) - 52.2181 (US Gas Price)
    R² = 0.978    F = 222.058
    County Vehicle Registrations:   t = 14.879    e = 1.47643
    US Gas Price:                   t = -3.994    e = -0.21371

Station 254B:
    AADT = 2990.64 + 0.1495 (County Vehicle Registrations)
    R² = 0.607    F = 17.013
    County Vehicle Registrations:   t = 4.125     e = 0.60300

Rural Minor Arterial

Station 25A:
    AADT = 1492.57 + 0.1286 (County Vehicle Registrations) - 24.9935 (US Gas Price)
    R² = 0.867    F = 32.489
    County Vehicle Registrations:   t = 7.426     e = 0.90147
    US Gas Price:                   t = -3.540    e = -0.29365

Station 279A:
    AADT = 4892.50 - 28.5409 (US Gas Price) + 0.0186 (County Employment)
    R² = 0.760    F = 15.811
    US Gas Price:                   t = -5.156    e = -0.26635
    County Employment:              t = 3.569     e = 0.24526

Station 301A:
    AADT = 2236.63 + 0.1423 (County Vehicle Registrations) - 22.2131 (US Gas Price)
    R² = 0.812    F = 21.530
    County Vehicle Registrations:   t = 5.962     e = 0.66731
    US Gas Price:                   t = -3.200    e = -0.26576

Station 319A:
    AADT = 752.47 + 0.0471 (County Vehicle Registrations)
    R² = 0.734    F = 30.311
    County Vehicle Registrations:   t = 5.506     e = 0.61456

Station 42A:
    AADT = 2147.18 + 0.0279 (County Vehicle Registrations)
    R² = 0.525    F = 9.965
    County Vehicle Registrations:   t = 3.157     e = 0.49557

Station 100X:
    AADT = 3045.80 + 0.0168 (County Vehicle Registrations)
    R² = 0.696    F = 22.876
    County Vehicle Registrations:   t = 4.783     e = 0.64675

Station 256A:
    AADT = 1?61.51 + 0.0819 (County Vehicle Registrations)
    R² = 0.632    F = 18.899
    County Vehicle Registrations:   t = 4.347     e = 0.33059

Station 262A:
    AADT = 2571.90 + 0.0356 (County Vehicle Registrations) - 12.9828 (US Gas Price)
    R² = 0.515    F = 5.309
    County Vehicle Registrations:   t = 2.855     e = 0.28256
    US Gas Price:                   t = -3.226    e = -0.23256

Rural Major Collector

Station 59A:
    AADT = 2772.53 + 0.0481 (County Vehicle Registrations)
    R² = 0.722    F = 28.563
    County Vehicle Registrations:   t = 5.344     e = 0.36063

Station 200X:
    AADT = 6557.79 + 0.0503 (County Vehicle Registrations)
    R² = 0.635    F = 13.940
    County Vehicle Registrations:   t = 3.734     e = 0.26407

Station 5420A:
    AADT = 764.34 + 0.1163 (County Employment)
    R² = 0.681    F = 23.447
    County Employment:              t = 4.842     e = 0.59744

Station 7047A:
    AADT = 1121.06 - 0.0433 (County Population)
    R² = 0.521    F = 11.988
    County Population:              t = -3.462    e = -3.46274

(*) For the unit and symbol of each variable, see Table 4.1 of Chapter 4.
    A '?' marks a digit that is illegible in the source copy.
Table 5.26
Disaggregate Traffic Forecasting Models (*)


Rural Interstate

Station 172A:
    AADT_f = AADT_p [1 + 5.24231 (Δ State Population)]

Station 3070A:
    AADT_f = AADT_p [1 - 0.44503 (Δ US Gas Price) + 7.74428 (Δ State Population)]

Station 5474A:
    AADT_f = AADT_p [1 + 6.18172 (Δ County Population)]

Rural Principal Arterial

Station 68A:
    AADT_f = AADT_p [1 + 0.86979 (Δ State Vehicle Registrations)]

Station 134A:
    AADT_f = AADT_p [1 - 0.43949 (Δ US Gas Price) + 0.83878 (Δ State Vehicle Registrations)]

Station 173A:
    AADT_f = AADT_p [1 + 1.47643 (Δ County Vehicle Registrations) - 0.21371 (Δ US Gas Price)]

Station 254B:
    AADT_f = AADT_p [1 + 0.60300 (Δ County Vehicle Registrations)]

Rural Minor Arterial

Station 25A:
    AADT_f = AADT_p [1 + 0.90147 (Δ County Vehicle Registrations) - 0.29365 (Δ US Gas Price)]

Station 279A:
    AADT_f = AADT_p [1 - 0.26635 (Δ US Gas Price) + 0.24526 (Δ County Employment)]

Station 301A:
    AADT_f = AADT_p [1 + 0.66731 (Δ County Vehicle Registrations) - 0.26576 (Δ US Gas Price)]

Station 319A:
    AADT_f = AADT_p [1 + 0.61456 (Δ County Vehicle Registrations)]

Station 42A:
    AADT_f = AADT_p [1 + 0.49637 (Δ County Vehicle Registrations)]

Station 100X:
    AADT_f = AADT_p [1 + 0.64675 (Δ County Vehicle Registrations)]

Station 256A:
    AADT_f = AADT_p [1 + 0.33059 (Δ County Vehicle Registrations)]

Station 262A:
    AADT_f = AADT_p [1 + 0.28236 (Δ County Vehicle Registrations) - 0.23256 (Δ US Gas Price)]

Rural Major Collector

Station 59A:
    AADT_f = AADT_p [1 + 0.36063 (Δ County Vehicle Registrations)]

Station 200X:
    AADT_f = AADT_p [1 + 0.28407 (Δ County Vehicle Registrations)]

Station 5420A:
    AADT_f = AADT_p [1 + 0.59744 (Δ County Employment)]

Station 7047A:
    AADT_f = AADT_p [1 - 3.43274 (Δ County Population)]

(*) (i)  For the unit and symbol of each variable, see Table 4.1 of Chapter 4.

    (ii) Δ represents the change in a predictor variable relative to its present value, expressed as a fraction.
         For example, ΔX = (X_f - X_p) / X_p, where X_p and X_f represent the present and future values of X.
The ability of the models shown in Table 5.26 to predict 1983 and 1984 traffic volumes was tested using 1980 as the "present year". The 1983 and 1984 data were not used in the development of the models, so they allow a comparison of the accuracy of the disaggregate models with that of extrapolation. The results of this comparison are shown in Table 5.27, which also shows the AADT for 1983 and 1984 obtained from simple extrapolation. Figures F1 to F4, selected randomly from the 19 figures in Appendix F, illustrate how this extrapolation is carried out. In these figures, an average line is drawn through the data points of each plot and is then extrapolated to 1984. This simple extrapolation is a very crude method. However, Table 5.16 shows that, against the aggregate models, simple extrapolation often gives better results over the short range. Simple extrapolation is unlikely to provide good results over longer ranges (more than 10 years), whereas the proposed model is expected to provide better results because it is based on the functional relationship between the response variable (AADT in this case) and the predictor variable(s).


The  disaggregate  model's  forecasts  come  closer  to  the 
actual  "future  values"  than  the  extrapolations  in  a 
majority  of  the  cases.  The  prediction  errors  for  either 
method  are  not  more  than  15  percent.  In  general,  both  the 
simple  extrapolation  and  the  disaggregate   models   provide 
Table 5.27

Performance of Disaggregate Traffic Forecasting Models

Highway Category           Station   Year   Actual   Predicted     Prediction    Extrapolated   Extrapolation
                                            AADT     AADT          Error in      AADT           Error in
                                                                   percent (*)                  percent (*)

Rural Interstate           172A      1983   18454    18454         .49           20750          12.44
                                     1984   19091    18885         -1.08         21000          10.00
                           3070A     1983   18219    19171         5.22          19800          8.68
                           5474A     1983   7047     7161          1.62          8000           13.52
                                     1984   7541     7281          -3.45         8050           6.75

Rural Principal Arterial   68A       1983   7969     7642          -4.10         8100           1.64
                                     1984   8105     7816          -3.57         8200           1.17
                           134A      1983   12366    12765         3.23          13200          6.74
                           173A      1983   12751    12087         -5.21         12900          1.17
                           254B      1983   9031     8086          -10.46        8800           -2.56
                                     1984   9661     8244          -14.67        8950           -7.36

Rural Minor Arterial       25A       1983   4245     4136          -2.57         4320           1.77
                           279A      1983   4762     5144          8.02          5020           5.42
                           301A      1983   3793     4238          11.73         4140           9.15
                           319A      1983   2211     2201          -.43          2420           9.45
                                     1984   2279     2236          -1.89         2460           7.94
                           42A       1983   4515     4353          -3.59         4675           3.54
                                     1984   4607     4411          -4.25         4710           2.24
                           100X      1983   9103     8608          -5.44         9420           3.48
                                     1984   9560     8643          -9.59         9540           -.21
                           256A      1983   2861     2839          [illegible]   3010           5.21
                                     1984   2943     [illegible]   [illegible]   3020           2.62
                           262A      1983   2488     2617          5.18          2620           5.31

Rural Major Collector      59A       1983   4551     4701          3.30          4780           5.03
                                     1984   4769     4718          -1.07         4600           1.73
                           200X      1983   9297     9471          1.87          9700           4.33
                                     1984   9950     9565          -3.84         9760           -1.91
                           5420A     1983   1979     [illegible]   6.97          2340           18.24
                           7047A     1983   281      262           -6.76         302            7.47
                                     1984   273      262           -4.03         310            13.55

(*) A '+' sign indicates overprediction and a '-' sign indicates underprediction.
comparable forecast errors over this short range of time. However, the disaggregate models are expected to provide increasingly better traffic forecasts than extrapolation as the planning horizon increases, while the projections from extrapolation will lose accuracy. While this short-range comparison between disaggregate models and extrapolation is inconclusive, there is an indirect indication that disaggregate models perform better over this time span than aggregate models. This is compatible with the comparative statistical measures obtained during the development and refinement of both model types. In general, with a lower number of variables, the disaggregate models yielded lower prediction errors than the aggregate models (see Tables 5.16 and 5.27).
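The error measure used in Table 5.27 follows from its footnote: a positive value is an overprediction and a negative value an underprediction, which corresponds to 100 (forecast - actual) / actual. The sketch below applies that formula to a few rows as read from Table 5.27; the function name and the choice of rows are illustrative assumptions, and the source table is partly illegible, so this is not a re-analysis of the full table.

```python
def percent_error(actual: float, forecast: float) -> float:
    """Signed forecast error in percent; positive means overprediction."""
    return 100.0 * (forecast - actual) / actual


# (station, year, actual AADT, model prediction, extrapolation), as read from Table 5.27.
ROWS = [
    ("3070A", 1983, 18219, 19171, 19800),
    ("5474A", 1983, 7047, 7161, 8000),
    ("173A", 1983, 12751, 12087, 12900),
    ("254B", 1983, 9031, 8086, 8800),
]

if __name__ == "__main__":
    for station, year, actual, predicted, extrapolated in ROWS:
        model_err = percent_error(actual, predicted)
        extrap_err = percent_error(actual, extrapolated)
        # Compare absolute errors to see which method came closer.
        closer = "model" if abs(model_err) < abs(extrap_err) else "extrapolation"
        print(f"{station} {year}: model {model_err:+.2f}%, "
              f"extrapolation {extrap_err:+.2f}%, closer: {closer}")
```

Comparing absolute errors row by row, as done here, is one simple way to tally how often each method comes closer to the observed AADT.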
CHAPTER 6


SUMMARY AND CONCLUSIONS


Both aggregate and disaggregate traffic forecasting models for rural state highways in Indiana were developed using traffic data from Automatic Traffic Record (ATR) stations and economic and demographic variables at the county, state and national levels. The models and the described procedure are intended to provide highway planners with a tool for simple, fast and inexpensive estimation of traffic projections. Some problems and limitations of the models, and suggestions to overcome those problems, have been discussed. This chapter presents the steps to implement the models and makes recommendations for further studies.

6.1  Guidelines  for  Applicability  of  Models 


Preliminary statistical analysis (Chapter 5) favored the disaggregate model applied to each station separately over the aggregate model applied to each highway class. The disaggregate models are location-specific, whereas the aggregate models are general in nature for a particular highway category. However, the use of disaggregate models is not limited to the locations for which they were developed. If a project site for which a forecast of future traffic is needed can be shown to be "similar" to a station for which a disaggregate model has been built, then the disaggregate model of that station could be employed. The following points are provided as a guide for deciding whether a section of highway is "similar" to a station for which a disaggregate model has been developed:

1.  A statistical test for the equality of two population means could be carried out for the response (Y) and predictor variables (X's) at the county level, to see whether the means of these variables are the same for the two points or sections of highway under consideration. The hypothesis and the decision rule for this test are explained in Appendix G.

2.  The stage of commercial and industrial land development, measured as the percentage of commercial and industrial land in the total land, should be approximately similar in the two counties.

3.  The highway type, its geographical location with respect to traffic generators (for example, schools, hospitals, restaurants, shopping centers, etc.), and the road network characteristics of the two points should be similar.

The aggregate model is general in nature for a particular highway category. It is designed to be applicable to any section under that category of highway, although it is usually not as reliable as the disaggregate model. If a project site cannot be shown to be "similar" to a station, then the aggregate model should be applied to that site.
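The first similarity guideline relies on a test for the equality of two population means, whose exact hypothesis and decision rule are given in Appendix G and are not reproduced here. As a rough illustration only, the sketch below computes the textbook pooled-variance two-sample t statistic; the assumption of equal variances, the function name, and the sample values are all hypothetical and may differ from the Appendix G procedure.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for H0: the two population means are equal."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (equal-variance assumption).
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_error, df


if __name__ == "__main__":
    # Hypothetical annual vehicle registrations (thousands) for two counties.
    county_of_station = [21.0, 22.5, 23.1, 24.0, 25.2]
    county_of_project = [20.4, 22.0, 23.5, 23.8, 25.9]
    t_value, df = pooled_t_statistic(county_of_station, county_of_project)
    print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

The resulting t value would then be compared with the critical value from a t table at the chosen significance level; a small absolute t gives no evidence against treating the two locations as similar on that variable.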

6.2  Summary  of  Aggregate  Models 


Elasticity-based aggregate traffic forecasting models were presented in Table 5.15. Each of these models is simple and does not contain more than three predictor variables (X's). The models have good R² values (65.8 percent to 83.7 percent). These models are statistically sound and simple, with only one predictor variable in three cases and two predictor variables in the other case. The results of the performance of these models were presented in Table 5.16 for the stations not used in model development. The forecast errors are reasonably small in most of the cases and speak well for the reliability of the models. The choice of predictor variables for the models was based on a combination of statistical analysis and subjective judgment. The predictor variables used in the models were found significant at the 5 percent level of significance (see Tables 5.12 and 5.13). The resulting models were found to be satisfactory within the limitations of the data available.

6.3 Summary of Disaggregate Models


Elasticity-based disaggregate models were presented in Table 5.26. Each of the models is simple and does not contain more than two predictor variables. The models have good R² values, from 51.5 percent to 97.8 percent. The results of the performance of the models (presented in Table 5.27) showed that the prediction/forecasting errors in 88 percent of the cases were equal to or less than 10 percent. The larger prediction errors (more than 10 percent) in the rest of the cases are due to insufficient data. The choice of predictor variables for the models was based on a combination of statistical analysis and subjective judgment, as described in Sections 5.3.2.1 to 5.3.3.1. The predictor variables used in the models were found significant at a level as low as the 5 percent level of significance (see Tables 5.23 and 5.24). The disaggregate models were found to be satisfactory within the limitations of the data, and better than the aggregate models with respect to performance and graphic residual analysis.
6.4 Problems, Limitations and Suggestions

A  few  problems  may  appear  as  soon  as  users  begin  to 
use  the  models  to  predict  rural  traffic.  The  most  serious 
problem  in  the  application  of  this  procedure  is  one  that 
is  common  to  all  forecasting  processes:  the  accuracy  of 
the  model  is  determined  to  a  large  extent  by  the  accuracy 
of  the  input,  especially  the  future  values  of  the 
predictor  variables  (X's).  In  this  study,  the  following 
predictor  variables  were  used  in  disaggregate  models:  (1) 
population,  (2)  households,  (3)  vehicle  registrations,  (4) 
employment  and  (5)  gas  price.  On  the  other  hand, 
aggregate  models  were  developed  using  only  (1)  population 
and  (2)  households.  The  Indiana  University  Business 
School  [26]  projects  the  population  and  number  of 
households  for  every  fifth  year  into  the  future,  but  there 
is  very  little  information  available  for  the  other 
variables  required  by  the  disaggregate  models.  The 
question  then  is  how  to  estimate  future  values  for  vehicle 
registrations,  employment  and  gas  price. 


Several  options  could  be  suggested  to  obtain  future 
estimates  of  vehicle  registrations.  The  first,  and  most 
appropriate,  is  to  check  the  Bureau  of  Motor  Vehicles  to 
see  if  they  have  forecasts  appropriate  for  our  model.  If 
that  fails,  then  the  following  methods  [38]  could  be 
employed  to  forecast  future  vehicle  registrations: 



1.  Calculate  the  average  annual  growth  rate  from  the 
historical  data  (say  1970  to  1982  data,  which  were 
used  in  data  tables),  and  assume  an  increasing, 
decreasing,  or  constant  rate  for  the  future.  This 
method  does  not  consider  reaching  a  saturation  level 
of  vehicle  ownership,  but  it  may  be  reflected  by 
altering  the  projected  growth  rate. 

2.  This method differs from the first only in that it accounts for the saturation phenomenon when estimating future vehicle registrations. Examine the trend of vehicles per person in previous years and then carry that trend into the future until the value reaches a pre-defined saturation level. For example, the trend of vehicles per person for the State of Indiana is shown in Figure 6.1. From this figure, 0.85 could be taken as the saturation level of vehicles per person for the State of Indiana. Then, by multiplying the projected number of vehicles per person in a future year by the population forecast, an estimate of that year's vehicle registrations can be obtained.
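The two methods above can be sketched as follows. This is a minimal illustration under stated assumptions: the growth rate is computed as a compound annual rate, the vehicles-per-person trend is taken to be linear, and every input value is hypothetical; only the 0.85 saturation level comes from the text. The function names are not from the source.

```python
def average_annual_growth_rate(first_value: float, last_value: float, years: int) -> float:
    """Compound average annual growth rate between two observations."""
    return (last_value / first_value) ** (1.0 / years) - 1.0


def project_constant_growth(base_value: float, annual_rate: float, years_ahead: int) -> float:
    """Method 1: carry a constant annual growth rate into the future."""
    return base_value * (1.0 + annual_rate) ** years_ahead


def project_with_saturation(
    vehicles_per_person: float,
    annual_increase: float,
    years_ahead: int,
    population_forecast: float,
    saturation: float = 0.85,
) -> float:
    """Method 2: extend a linear vehicles-per-person trend, capped at saturation."""
    projected_ratio = min(vehicles_per_person + annual_increase * years_ahead, saturation)
    return projected_ratio * population_forecast


if __name__ == "__main__":
    # Hypothetical county registrations in 1970 and 1982.
    rate = average_annual_growth_rate(20000.0, 30000.0, 12)
    print(f"average annual growth rate: {rate:.4f}")
    print(f"method 1, 10 years ahead: {project_constant_growth(30000.0, rate, 10):.0f}")
    # Hypothetical ratio of 0.70 rising 0.02 per year, with a population forecast of 50,000.
    print(f"method 2, 10 years ahead: {project_with_saturation(0.70, 0.02, 10, 50000.0):.0f}")
```

In the second call the uncapped ratio would reach 0.90, so the cap at 0.85 applies. This shows how the saturation level restrains the long-range estimate compared with unconstrained growth.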


[Figure 6.1: Trend of Vehicle/Person, State of Indiana. Vertical
axis: vehicles per person (0.30 to 1.00); horizontal axis: year
(1940 to 1990).]

The ways to obtain future values for employment are similar to
those for vehicle registrations. The first, and most appropriate,
is to contact the Employment Security Division [24] for employment
forecasts. If that fails, then the easiest way is to calculate the
average annual growth rate from historical data (1970 to 1982 data
were used in the data tables), and assume an increasing,
decreasing, or constant rate for the future. The local employment
office may also be able to provide some information about future
levels of employment. A second method is to calculate employment
per person in previous years. In this method, a saturation level,
if any, could be employed. The trend is carried out to the future
and then multiplied by the estimated population in future years to
develop the employment data.


To get future values of US gas price, a first step will be to
check whether the Independent Petroleum Association of America
and/or the US Department of Energy have usable forecasts. If no
outside fuel price forecasts are available, the user still has
recourse. The user can devise a series of simple fuel price
projections (by extrapolation, etc.) to produce a range of values
that can represent high, medium, and low fuel price scenarios. The
traffic forecasts obtained with these values can then be compared
with the results of models that do not require fuel price as an
input variable, if such models exist. At a minimum, the traffic
forecasts based on the range of fuel price values could be compared
against a range of traffic volume extrapolations, in search of
some degree of consensus.
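
A minimal sketch of such a set of fuel price scenarios is given below. The compound annual growth form, the function name, and the rates in the usage note are assumptions chosen for illustration; they are not forecasts from this study.

```python
def fuel_price_scenarios(base_price, base_year, future_year, annual_rates):
    """Project a base-year fuel price under several assumed annual
    growth rates (compound growth), one result per named scenario."""
    years = future_year - base_year
    return {name: base_price * (1.0 + rate) ** years
            for name, rate in annual_rates.items()}


# Hypothetical rates for low, medium, and high price scenarios.
example_rates = {"low": -0.01, "medium": 0.0, "high": 0.02}
scenarios = fuel_price_scenarios(50.0, 1982, 1992, example_rates)
```

Each scenario price would then be entered in the traffic forecasting model in turn, giving a range of AADT forecasts rather than a single value.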

Applicability of the models in various areas may also cause
problems. How can a user decide whether the project area, for
which a future traffic forecast is needed, is "rural" enough for
the model(s)? It is difficult to provide guidelines on this issue,
but suggestions [49] exist. Judgment is required in making this
determination. In very approximate terms, highways with more than
10 uncontrolled access points per mile (on one side) would be
considered "suburban". Any highway on which left or right turns
cause appreciable delay to through vehicles would also be
classified as "suburban". Multilane suburban highways and rural
roads differ from suburban arterials in the following features:
(1) their roadside development is not as intense, (2) the density
of traffic access points is not as high, and (3) signalized
intersections are more than 2 miles apart. In fact, highways with
signal spacing of 2 miles or more could be treated as "rural"
highways. Increased use of the developed models will lessen this
problem.
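
These rules of thumb can be written as a rough screening check. The sketch below is not a substitute for judgment: the function name, the argument names, and the fall-through case are assumptions, and treating a highway with no signals as having unlimited signal spacing is also an assumption of this example.

```python
def screen_highway(access_points_per_mile, turns_delay_through_traffic,
                   signal_spacing_miles=None):
    """Apply the approximate rural/suburban rules of thumb.

    access_points_per_mile: uncontrolled access points per mile, one side.
    turns_delay_through_traffic: True if turning vehicles cause
        appreciable delay to through vehicles.
    signal_spacing_miles: spacing of signalized intersections, or None
        when there are no signals (assumed here to count as rural).
    """
    if access_points_per_mile > 10 or turns_delay_through_traffic:
        return "suburban"
    if signal_spacing_miles is None or signal_spacing_miles >= 2:
        return "rural"
    # None of the approximate rules settles the case.
    return "judgment required"
```

A site with 4 access points per mile, no turning delay, and signals 3 miles apart would be screened as "rural".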


The model formulation in Chapter 3 assumes that elasticities are
constant over time. Historically, travel has been growing at a
fairly constant rate for many years. Although fuel shortages
interrupted this growth for a while, it has resumed. Therefore,
the assumption of constant elasticities should not introduce
substantial errors. Variable elasticities, on the other hand, are
not very common in traffic forecasting, because they involve more
sophisticated and expensive analysis [28]. Such analysis runs
against the principle that the models should be easy to understand
and inexpensive to use. When new census data become available,
however, the elasticities could be recalculated and the
appropriateness of the earlier values checked. If the elasticities
seem to change significantly (for example, by more than 10
percent), then the new set of elasticities should be used in the
model.
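
The check on recalculated elasticities could be sketched as follows. The function name, the dictionary layout, and the sample values in the usage note are assumptions made for illustration; the 10 percent threshold is the example value given above.

```python
def elasticities_changed(old, new, threshold=0.10):
    """Return True if any elasticity changed by more than `threshold`
    (relative to its earlier value), in which case the new set
    should replace the old one in the model.

    old, new: dicts mapping a predictor variable name to its elasticity.
    """
    for name, old_value in old.items():
        relative_change = abs(new[name] - old_value) / abs(old_value)
        if relative_change > threshold:
            return True
    return False
```

With hypothetical earlier elasticities of 0.80 and -0.20, recalculated values of 0.82 and -0.20 (a 2.5 percent change) would leave the earlier set in use, whereas 1.00 and -0.20 (a 25 percent change) would call for the new set.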

Users are expected to weigh the results of forecasting models in
terms of the local situation, and adjust them according to their
professional judgment of the specific area.

6.5  Stepwise  Plan  for  Implementation 

The steps that are recommended for the implementation of the
aggregate and disaggregate models to predict the future traffic
for rural roads of Indiana are listed below.

1. Determine the exact location (i.e., county) of the roadway for
which the forecast is needed.

2. Select the traffic model(s) that will be used to predict
traffic.

a. Determine the functional class of the roadway.

This will determine which aggregate model is applicable to the
project site. To determine the functional class, the functional
classification system map, prepared by the Division of Planning of
the Indiana State Highway Commission, will be the best guide.
Moreover, the definitions provided in Section 2.8 of this report
would be helpful in finding the appropriate highway class. The
project site will be classified in one of the four categories of
highways provided in Table 4.2. If the classification is not
clear-cut, then personal judgment should be used, and
documentation provided.

b. Examine the project site with respect to ATR stations.

Check if the project site is one of the Automatic Traffic Record
(ATR) stations used in the development of the models. If it is,
then the disaggregate model for that station will be applicable.
Otherwise, the procedures described in Section 6.1 could be used
to determine if the project site is "similar" to one of the
stations used in the model development. If the project site is not
found to be "similar" to an ATR station, the aggregate model for
its highway category must be applied to the project site.
Identification of the highway category of the project site is the
only criterion used to select the appropriate aggregate model.

3.  Collect  the  base  year  AADT. 

The  base  year  AADT  of  the  project  site  can  be 
one  value  for  a  small  project  (e.g.,  intersections), 
or  a  series  of  estimates  for  roadway  sections  for  a 
larger  project  (e.g.,  lane  widening).  One  possible 
source  of  data  would  be  the  Highway  Department's 
Traffic  Volume  book.  If  the  Traffic  Volume  Book 
fails  to  provide  such  information,  then  it  could  be 
determined  from  short-term  counts  at  the  project 
site,  using  the  procedure  described  in  Section  2.6. 

4.  Collect   the   base   and   future   year   data   for   the 
predictor  variables. 


The  description  of  variables  in  Section  4.2  is  a 
guide  to  the  sources  of  the  required  predictor 
variables.  Section  6.4  will  also  be  helpful, 
particularly  with  reference  to  future  year  data  for 
the  required  predictor  variables. 
5. Estimate the future year AADT.

a. Calculate the future year AADT by using the appropriate
aggregate model (Table 5.15), as determined in step 2(a), with the
values found in steps 3 and 4. Denote this AADT estimate as
$AADT_a$.

b. Calculate the future year AADT by using the appropriate
disaggregate model (Table 5.26), as determined in step 2(b), if
possible, with the values found in steps 3 and 4. Denote this AADT
estimate as $AADT_d$.

c. Find the weighted average of the two AADT estimates found in
steps 5(a) and 5(b). Users may give more weight to the AADT found
in step 5(b), because the disaggregate model was found to perform
better than the aggregate model. The weighted average of AADT is
calculated by using equation 6.1.

$$AADT_w = w \cdot AADT_a + (1 - w) \cdot AADT_d \qquad (6.1)$$

where

$w$ = weight given to the AADT estimate from the aggregate model,
$0 \le w \le 1.00$,

$AADT_a$ = AADT estimate by the aggregate model,

$AADT_d$ = AADT estimate by the disaggregate model,

$AADT_w$ = weighted AADT estimate.

In general, given the better performance of disaggregate models
with respect to aggregate models, the value of $w$ is recommended
to be less than 0.50 (a value of $w$ between 0.35 and 0.45 is
suggested). If an AADT estimate using the disaggregate model
(step 5b) is not possible, then the value of $w$ must be 1.

6. Adjust the estimated future year AADT.

If historical AADT counts are available for the project site, plot
AADT against time and extend the trend to the future year. Check
whether the projected AADT differs significantly (say, by more
than 25 percent) from the AADT estimate found at step 5. In case
of a significant difference, an average of the estimate at step 5
and the extension of the plot of AADT against time at the desired
year may be taken as the "future year AADT". Otherwise, the
estimate resulting from step 5 will be the "future year AADT".

6.6  Recommendations  for  Future  Study 


The  methodology  presented  in  this  report  was  based  on 
a  small  number  of  continuous  count  stations.  The 
aggregate  and  disaggregate  traffic  forecasting  models  for 
rural  roads  of  Indiana  were  developed  using  this 
methodology.  Continuous  count  stations  are  the  only 
locations  where  "true"  historic  AADT  counts  are  available. 
Further  traffic  forecasting  studies  will  be  helped  by  the 
installation  of  more  continuous  count  stations  at 
locations  representing  a  variety  of  highway  categories  and 
traffic  characteristics.  It  is  expected  that,  with  an 
increased  number  of  continuous  count  stations,  the  present 
methodology  will  provide  better  statistical  results  and 
model  performance.  Moreover,  with  an  increased  number  of 
count  stations,  it  may  become  possible  to  divide  the  whole 
rural  state  network  into  regions  or  otherwise  separate 
different  historical  growth  rates.  Statistical  methods 
could  be  employed  to  identify  the  different  sectors  or 
groupings.  In  the  early  stage  of  this  study,  this 
approach  was  attempted,  but  dropped  due  to  the  limited 
number  of  ATR  stations.  The  development  of  a  model  for 
each  sector  would  be  similar  to  aggregate  and  disaggregate 
models  developed  in  this  report. 

Time series analysis could be used to forecast future traffic.
According to Armstrong [4,5], the time series approach could be
combined with the present approaches to obtain reliable traffic
forecasts. Time series analysis treats traffic volume as a
function of time and uses land use development as the starting
point to formulate the traffic growth: as time passes, more land
is developed and traffic increases proportionally. Time series
analysis is also a way to introduce time lags, especially with
respect to economic predictor variables, to see if better ADT
forecasting models are possible.

The variables used in statistical analysis are more or less
subject to error. The prediction in this case could be further
modified by introducing an error term in the regression
formulation. Prediction considering error-in-variables is not
well practiced. Ganse et al. [50] made predictions of earthquake
magnitudes by employing the consideration of error-in-variables.


One of the major problems encountered in aggregate analysis was
"mix-normal" data. The AADT data for each station have a normal
distribution. But when the stations were combined in aggregate
analysis as a highway category, the AADT data failed to produce a
normal distribution, primarily due to the limited number of count
stations. The treatment of this mixture distribution is also a new
area in statistical science. Kotz et al. [51] provided some useful
theoretical discussion of this mixture distribution. Exhaustive
investigations failed to find any treatment regarding "mix-normal"
that could be directly employed in prediction. In the absence of
suitable computer program(s), available program(s) could be
modified using the theory underlying the treatment of mixtures.
If this "mix-normal" problem in the aggregate analysis is solved,
then the aggregate models will provide better results than those
found in the present study. In that case, it will also reduce the
necessity to increase the number of count stations.

6.7 Conclusions


The  principal  objective  of  this  report  was  to  develop 
simple,  fast  and  inexpensive  traffic  forecasting  models 
for  rural  state  highways  in  Indiana.  The  study  first 
identified  suitable  methodologies  and  then  applied 
statistical  analyses  to  find  suitable  variables  to  employ 
in  the  models.  The  analyses  done  to  develop  the 
elasticity-based  aggregate  and  disaggregate  models  are  as 
reliable  as  possible  within  the  limitations  of  the  data. 
The developed models could be updated as new data become
available. The developed models provide better statistical results
(for example, $R^2$) than those found in a previous, similar study
[38]. Moreover, variable selection criteria used in this study are
not based solely on stepwise regression. The variables used in the
models were found to be statistically significant, and it was
found that no other variables would provide additional significant
predictive power in the models.

The step-by-step instructions in Section 6.5 are provided to give
a structured approach to implementing the models. The developed
models are expected to provide highway planners with a means for
simple, fast, and inexpensive estimation of future traffic.


In  almost  every  state,  the  task  of  traffic 
forecasting  for  the  rural  areas  is  heavily  dependent  on 
the  AADT  counts  at  continuous  count  stations.  Any  state 
with  adequate  historical  traffic  data  at  continuous  count 
stations  could  employ  this  model  building  approach  to 
determine  future  year  AADT  at  rural  locations. 

The prediction of rural traffic volume has been relatively
neglected despite its many potential uses. The most obvious and
direct use of a rural traffic forecasting model is for the
estimation of the benefits from alternative highway system
improvement projects. A second application would be as an aid to
the appropriate design of a project (for example, number of lanes
or type of traffic control). The identification of potential
problem segments in the state highway system could be accomplished
by using the models to identify rapid traffic growth areas.

Undoubtedly, more work must be done in this area to improve the
accuracy and reliability of a traffic projection model. It is
important to note that the models developed in this report are not
purported to be perfect forecasting tools, if such a model could
ever exist. Users are expected to weigh the results in terms of
the local situation, and make adjustments in accordance with their
professional judgment. Finally, it is expected that combining
different methods will provide more reliable traffic forecasts to
highway planners.

Station,
County
(1)          (2)     (3)     (4)    (5)     (6)     (7)     (8)

25A,        3245   21544   37.97   1970   31382    9696    7710
Noble       3276   21837   36.88   1971   31400    9791    7198
            3478   23536   36.13   1972   32300   10167    8165
            3977   25371   37.60   1973   33000   10486    8351
            3843   25356   46.32   1974   33500   10747    9138
            3739   25653   45.14   1975   33500   10851    7545
            3913   26833   42.75   1976   33800   11056    8061
            4071   28220   42.09   1977   34100   11264    9432
            4251   29281   40.46   1978   34600   11543    9894
            4051   29889   48.47   1979   35400   11929   10029
            3848   29979   59.45   1980   35443   12065    9244
            3885   29891   56.40   1981   35000   12037    9426
            3898   29501   51.63   1982   35300   12266    9129

301A,       3298   14036   37.97   1970   21138    6454    4261
Ripley      3545   14392   36.88   1971   21700    6687    4279
            3620   15165   36.13   1972   22000    6843    4756
            3634   16070   37.60   1973   22400    7033    5063
            3554   16461   46.32   1974   22900    7258    5263
            3624   16959   45.14   1975   23500    7520    5398
            3840   17679   42.75   1976   23700    7658    5554
            3920   18397   42.09   1977   24000    7831    5694
            4049   18973   40.46   1978   24100    7941    7027
            4119   19503   48.47   1979   24500    8154    7466
            3845   19869   59.45   1980   24398    8202    7287
            3798   20223   56.40   1981   24600    8354    7024
            3740   20112   51.63   1982   24700    8475    7048

313A,       5572   26922   37.97   1970   44176   12900    3693
Morgan      5697   28184   36.88   1971   44500   13146    3749
            6049   30834   36.13   1972   44500   13300    4440
            5850   33276   37.60   1973   46600   14095    4732
            5913   34425   46.32   1974   47300   14479    4921
            5970   35729   45.14   1975   48100   14904    4939
            6079   37434   42.75   1976   48500   15214    5433
            5895   39589   42.09   1977   49800   15817    6010
            5980   41534   40.46   1978   50600   16275    8286
            6010   42775   48.47   1979   51200   16680    8612
            5650   43414   59.45   1980   51999   17160    8430
            5631   43802   56.40   1981   52500   17554    8271
            5565   43702   51.63   1982   52600   17822    8317

Table    A3    (continued) 

Station,
County
(1)          (2)     (3)     (4)    (5)     (6)     (7)     (8)

262A,       2571   15947   37.97   1970   20995    6872    3806
White       2439   16164   36.88   1971   21300    7051    3873
            2452   17082   36.13   1972   21300    7132    4856
            2512   18200   37.60   1973   21900    7417    5322
            2389   18756   46.32   1974   22000    7539    5340
            2415   19165   45.14   1975   22100    7663    5092
            2564   20024   42.75   1976   22600    7931    5672
            2555   20852   42.09   1977   22900    8134    5609
            2687   21497   40.46   1978   23300    8378    7071
            2527   22007   48.47   1979   23400    8518    7637
            2444   22321   59.45   1980   23867    8798    7247
            2460   22634   56.40   1981   23800    8885    7305
            2436   22579   51.63   1982   24000    9076    6862

a = Y     b = X1    c = X2    d = X3
e = X4    f = X5    g = X6

Note: For the meaning and definition of each variable,
see Table 4.1 and Chapter 4 in the text.

Data Table for Rural Major Collector

Station,
County
(1)          (2)     (3)     (4)    (5)     (6)     (7)     (8)

47A,        1064   19885   37.97   1970   28915    9645    6884
Randolph    1115   20226   36.88   1971   29500    9905    6876
            1236   21122   36.13   1972   29600   10005    7852
            1118   22220   37.60   1973   29900   10174    8495
            1117   22770   46.32   1974   29800   10208    8593
            1093   23153   45.14   1975   29800   10278    7517
            1064   23697   42.75   1976   29800   10348    7544
            1057   25175   42.09   1977   30200   10559    8555
            1159   25362   40.46   1978   29800   10491   10037
            1161   25785   48.47   1979   30100   10671    9634
            1132   25580   51.63   1982   29000   10501    7848

59A,        3667   23174   37.97   1970   35096   10792    3676
Hancock     3850   24314   36.88   1971   35200   10896    3935
            4083   26423   36.13   1972   36700   11437    4756
            4250   28570   37.60   1973   38400   12048    5359
            4290   29951   46.32   1974   39600   12509    5480
            4391   31284   45.14   1975   40100   12754    5256
            4634   32485   42.75   1976   40700   13034    5588
            4420   35745   42.09   1977   41400   13351    5889
            4707   37448   40.46   1978   42100   13672    7981
            4707   38695   48.47   1979   43200   14128    8397
            4660   38017   59.45   1980   43939   14472    8432
            4346   38234   56.40   1981   43900   14563    8038
            4368   38447   51.63   1982   43800   14634    7769

5420A,      1722   22561   37.97   1970   33930   11044    7516
Montgomery  1719   23115   36.88   1971   34300   11287    7368
            1845   24694   36.13   1972   34600   11513    8644
            1859   26075   37.60   1973   35000   11777    9009
            1772   26619   46.32   1974   35200   11979    9308
            1815   27266   45.14   1975   35400   12186    9195
            1863   28460   42.75   1976   35400   12328    9667
            1901   28696   42.09   1977   35500   12508    9858
            2248   29514   40.46   1978   35600   12693   12251
            2586   29760   48.47   1979   35600   12846   12434
            2139   30356   59.45   1980   35501   12967   11852
            1970   30412   56.40   1981   34900   12905   11663
            1890   30148   51.63   1982   35300   13216   11313

a = Y     b = X1    c = X2    d = X3
e = X4    f = X5    g = X6

Note: For the meaning and definition of each variable,
see Table 4.1 and Chapter 4 in the text.

Appendix B

Scatter Plots:

Aggregate Analysis

1. Rural Interstate: Figure B1.1 to Figure B1.13

2. Rural Principal Arterial: Figure B2.1 to Figure B2.13

3. Rural Minor Arterial: Figure B3.1 to Figure B3.6

4. Rural Major Collector: Figure B4.1 to Figure B4.6

FIGURE B1.1: AADT VS. COUNTY VEHICLE REGISTRATION
FIGURE B1.2: AADT VS. US GASOLINE PRICE
FIGURE B1.3: AADT VS. YEAR
FIGURE B1.4: AADT VS. COUNTY POPULATION
FIGURE B1.5: AADT VS. COUNTY HOUSEHOLDS
FIGURE B1.6: AADT VS. COUNTY EMPLOYMENT
FIGURE B1.7: AADT VS. STATE VEHICLE REGISTRATION
FIGURE B1.8: AADT VS. STATE POPULATION
FIGURE B1.9: AADT VS. STATE HOUSEHOLDS
FIGURE B1.10: AADT VS. STATE EMPLOYMENT
FIGURE B1.11: AADT VS. CONSUMER PRICE INDEX
FIGURE B1.12: AADT VS. GROSS NATIONAL PRODUCT
FIGURE B1.13: AADT VS. PER CAPITA NATIONAL INCOME
FIGURE B2.1: AADT VS. COUNTY VEHICLE REGISTRATION
FIGURE B2.2: AADT VS. US GASOLINE PRICE
[Scatter plot. Horizontal axis: US gasoline price in cents per
gallon of 1972 $.]
FIGURE B2.3: AADT VS. YEAR
FIGURE B2.4: AADT VS. COUNTY POPULATION
FIGURE B2.5: AADT VS. COUNTY HOUSEHOLDS
FIGURE B2.6: AADT VS. COUNTY EMPLOYMENT
FIGURE B2.7: AADT VS. STATE VEHICLE REGISTRATION
FIGURE B2.9: AADT VS. STATE HOUSEHOLDS
FIGURE B2.10: AADT VS. STATE EMPLOYMENT
FIGURE B2.11: AADT VS. CONSUMER PRICE INDEX
FIGURE B2.12: AADT VS. GROSS NATIONAL PRODUCT
FIGURE B2.13: AADT VS. PER CAPITA NATIONAL INCOME
FIGURE B3.1: AADT VS. COUNTY VEHICLE REGISTRATION
FIGURE B3.2: AADT VS. US GASOLINE PRICE
FIGURE B3.5: AADT VS. COUNTY HOUSEHOLDS
FIGURE B4.1: AADT VS. COUNTY VEHICLE REGISTRATION

Figure C3.2: Normal Probability Plot of Residuals
Figure C3.3: Normal Probability Plot of Residuals

1. Station 68A: Figure D1 to Figure D11

2. Station 7047A: Figure D12 to Figure D17

Figure D2: AADT vs. US Gasoline Price
(Station 68A)
Figure D4: AADT vs. County Population
(Station 68A)
Figure D5: AADT vs. County Households
(Station 68A)
[Scatter plot. Horizontal axis: county households, in thousands.]
Figure D6: AADT vs. County Employment
(Station 68A)
[Scatter plot. Horizontal axis: yearly avg. of county covered
employment (1000).]
Figure D9: AADT vs. State Households
Figure D10: AADT vs. State Employment
(Station 68A)
Figure D11: AADT vs. Per Capita National Income
(Station 68A)
Figure D13: AADT vs. US Gasoline Price
(Station 7047A)
Figure D14: AADT vs. Year
Figure D15: AADT vs. County Population
Figure D17: AADT vs. County Employment
(Station 7047A)

Residual Plots

Disaggregate Analysis

Figure E1.1: Residual Plot against Y
(Station 3070A)
(Station 68A)
Figure E1.3: Residual Plot against Y
(Station 301A)
Figure E1.4: Residual Plot against Y
(Station 7047A)
Figure E2.1.1: Residual Plot against X,
(Station 3070A)
Figure E2.1.2: Residual Plot against Xg
(Station 3070A)
Figure E2.2.1: Residual Plot against X.
(Station 68A)
Figure E2.3.1: Residual Plot against XJ
(Station 301A)
Figure E2.3.2: Residual Plot against X,
(Station 301A)
Figure E2.4.1: Residual Plot against X
(Station 7047A)
Figure E3.1: Normal Probability Plot of Residuals
(Station 3070A)
Figure E3.2: Normal Probability Plot of Residuals
(Station 68A)
Figure E3.3: Normal Probability Plot of Residuals
(Station 301A)
Figure E3.4: Normal Probability Plot of Residuals
(Station 7047A)
Figure E4.1: Residual Plot against X3
(Station 3070A)
Figure E4.2: Residual Plot against X3
(Station 68A)
[Residual plot. Horizontal axis: Year (X3), 1970 to 1981;
vertical axis: residual.]
Figure E4.3: Residual Plot against X3
(Station 301A)
Figure E4.4: Residual Plot against X3
(Station 7047A)

Examples on Simple Extrapolation

Figure F1: Simple Extrapolation of AADT
(Station 59A)
Figure F2: Simple Extrapolation of AADT
(Station 68A)
Figure F3: Simple Extrapolation of AADT
(Station 173A)
Figure F4: Simple Extrapolation of AADT
(Station 256A)
Appendix G


Statistical Test for Equality of
Two Population Means

Test for Equality of Two Population Means [37]


Hypotheses

$$H_0: \mu_1 = \mu_2$$

$$H_a: \mu_1 \neq \mu_2$$

where $\mu_1$ and $\mu_2$ are two normal population means.
Here, $H_0$ asserts that the two population means are the
same, while $H_a$ presumes they are not the same.


Evaluation of test statistic

Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means of two
independent samples. Estimators of the two population means
are the sample means, which are calculated as follows:

$$\bar{X}_1 = \frac{\sum_{i=1}^{n_1} X_{i1}}{n_1} \quad \text{and} \quad \bar{X}_2 = \frac{\sum_{i=1}^{n_2} X_{i2}}{n_2}$$

where $n_1$ and $n_2$ are the numbers of observations in the
two samples. The estimator of $\mu_1 - \mu_2$ is
$\bar{X}_1 - \bar{X}_2$. An estimator of the common variance
$\sigma^2$, denoted by $s^2$, is:

$$s^2 = \frac{\sum_{i=1}^{n_1} (X_{i1} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{i2} - \bar{X}_2)^2}{n_1 + n_2 - 2}$$

An estimator of $\sigma^2(\bar{X}_1 - \bar{X}_2)$, the
variance of the sampling distribution of
$\bar{X}_1 - \bar{X}_2$, is denoted by
$s^2(\bar{X}_1 - \bar{X}_2)$ and given by:

$$s^2(\bar{X}_1 - \bar{X}_2) = s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)$$

Then the test statistic is:

$$t^* = \frac{\bar{X}_1 - \bar{X}_2}{s(\bar{X}_1 - \bar{X}_2)}$$


Decision Rule

Let t-value $= t(1 - \alpha/2;\ n_1 + n_2 - 2)$. If
$|t^*| <$ t-value, conclude $H_0$, i.e., the two population
means are the same. Otherwise conclude $H_a$, i.e., the two
population means are not the same. Here $\alpha$ is the level
of significance (or degree of uncertainty). A value of
5 percent could be recommended for $\alpha$. The term
"$n_1 + n_2 - 2$" is known as the degrees of freedom; 2 degrees
of freedom are lost in estimating the two population means
from the sample means.
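
The test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the study: the function name `pooled_t_test` and the sample counts are hypothetical, and the critical value is passed in by the caller because it comes from a t table (2.306 is the tabulated t(0.975; 8) for the ten illustrative observations used here).

```python
from math import sqrt
from statistics import mean


def pooled_t_test(sample1, sample2, t_critical):
    """Two-sample t-test using a pooled estimate of the common variance.

    t_critical is t(1 - alpha/2; n1 + n2 - 2), looked up in a t table.
    Returns (t_star, means_are_same).
    """
    n1, n2 = len(sample1), len(sample2)
    xbar1, xbar2 = mean(sample1), mean(sample2)
    # Pooled estimate s^2 of the common variance.
    ss1 = sum((x - xbar1) ** 2 for x in sample1)
    ss2 = sum((x - xbar2) ** 2 for x in sample2)
    s2 = (ss1 + ss2) / (n1 + n2 - 2)
    # Estimated standard deviation of the difference of the sample means.
    s_diff = sqrt(s2 * (1 / n1 + 1 / n2))
    t_star = (xbar1 - xbar2) / s_diff
    # Decision rule: conclude H0 (means are the same) if |t*| < t-value.
    return t_star, abs(t_star) < t_critical


# Hypothetical AADT counts for two stations (illustrative values only).
station_a = [7000, 7100, 7250, 6900, 7050]
station_b = [7500, 7400, 7650, 7550, 7450]
t_star, same = pooled_t_test(station_a, station_b, t_critical=2.306)
print(round(t_star, 3), same)
```

Passing the critical value as an argument keeps the sketch free of third-party dependencies; a statistics library could supply the t quantile instead.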


Example

Two Rural Principal Arterial stations (68A and 254B) are
used to demonstrate the principles described above. The
data for this example are taken from Table A2 in Appendix A.
Let the data of stations 68A and 254B represent samples of
populations, indicated by the subscripts 1 and 2 in the
discussion above. The values of the pertinent statistics and
the decisions for the response variable AADT and the county
level predictor variables are shown in Table G1. If the
population means of the response variable (AADT in this
case) and of the county level predictor variables (employed
in the proposed disaggregate model at one location) for the
two locations are statistically the same, then the locations
are "similar" and the disaggregate model is applicable at
both locations. In Table G1, however, although the mean AADT
of the two stations is statistically the same, none of the
predictor variables are statistically the same for the two
stations. Thus, the stations are not "similar" and the
disaggregate model developed for one station is not
applicable at the other station.


Table G1
Tests for Equality of Variable Means for Two Locations
(n1 = n2 = 13 and t-value = 2.064 for every variable)

Variable                       X-bar 1   X-bar 2   s^2        |t*|    Conclusion
-----------------------------  --------  --------  ---------  ------  ------------------------------
AADT                               7104      7533     520147   1.517  AADT of two stations are same
County Vehicle Registrations      24202     30363   11917193   4.565  of two counties are not same
County Population                 31940     37795    2951302   8.689  of two counties are not same
County Households                 10350     12662     961134   6.012  of two counties are not same
County Employment                  7621     10319    2927404   4.013  of two counties are not same
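
As an arithmetic check on the procedure, the County Households row of Table G1 can be reproduced from its key statistics ($n_1 = n_2 = 13$, sample means 10350 and 12662, pooled variance 961134). Which of stations 68A and 254B each mean belongs to is assumed here to follow the order in which the means are listed in the table.

$$s^2(\bar{X}_1 - \bar{X}_2) = 961134 \left( \frac{1}{13} + \frac{1}{13} \right) \approx 147867, \qquad s(\bar{X}_1 - \bar{X}_2) \approx 384.5$$

$$|t^*| = \frac{|10350 - 12662|}{384.5} \approx 6.01 > 2.064$$

The result agrees with the tabulated value of 6.012, so $H_a$ is concluded: the mean household counts of the two counties are not the same.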


